Preface



These notes cover the material from the second half of a two-semester sequence of mathematical methods courses given to first-year physics graduate students at the University of Illinois. They consist of three loosely connected parts: i) an introduction to modern "calculus on manifolds", the exterior differential calculus, and algebraic topology; ii) an introduction to group representation theory and its physical applications; iii) a fairly standard course on complex variables.
Chapter 1

Tensors in Euclidean Space 



In this chapter we explain how a vector space V gives rise to a family of 
associated tensor spaces, and how mathematical objects such as linear maps 
or quadratic forms should be understood as being elements of these spaces. 
We then apply these ideas to physics. We make extensive use of notions and 
notations from the appendix on linear algebra, so it may help to review that 
material before we begin. 

1.1 Covariant and Contravariant Vectors 

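When we have a vector space $V$ over $\mathbb{R}$, and $\{\mathbf{e}_1, \mathbf{e}_2, \ldots, \mathbf{e}_n\}$ and $\{\mathbf{e}'_1, \mathbf{e}'_2, \ldots, \mathbf{e}'_n\}$ are both bases for $V$, then we may expand each of the basis vectors $\mathbf{e}_\mu$ in terms of the $\mathbf{e}'_\mu$ as

$$\mathbf{e}_\nu = a^\mu{}_\nu\, \mathbf{e}'_\mu. \tag{1.1}$$

We are here, as usual, using the Einstein summation convention that repeated indices are to be summed over. Written out in full for a three-dimensional space, the expansion would be

$$\begin{aligned}
\mathbf{e}_1 &= a^1{}_1 \mathbf{e}'_1 + a^2{}_1 \mathbf{e}'_2 + a^3{}_1 \mathbf{e}'_3,\\
\mathbf{e}_2 &= a^1{}_2 \mathbf{e}'_1 + a^2{}_2 \mathbf{e}'_2 + a^3{}_2 \mathbf{e}'_3,\\
\mathbf{e}_3 &= a^1{}_3 \mathbf{e}'_1 + a^2{}_3 \mathbf{e}'_2 + a^3{}_3 \mathbf{e}'_3.
\end{aligned}$$

We could also have expanded the $\mathbf{e}'_\mu$ in terms of the $\mathbf{e}_\mu$ as

$$\mathbf{e}'_\nu = (a^{-1})^\mu{}_\nu\, \mathbf{e}_\mu. \tag{1.2}$$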
As the notation implies, the matrices of coefficients $a^\mu{}_\nu$ and $(a^{-1})^\mu{}_\nu$ are inverses of each other:

$$a^\mu{}_\nu (a^{-1})^\nu{}_\sigma = (a^{-1})^\mu{}_\nu\, a^\nu{}_\sigma = \delta^\mu_\sigma. \tag{1.3}$$

If we know the components $x^\mu$ of a vector $\mathbf{x}$ in the $\mathbf{e}_\mu$ basis then the components $x'^\mu$ of $\mathbf{x}$ in the $\mathbf{e}'_\mu$ basis are obtained from

$$\mathbf{x} = x'^\mu \mathbf{e}'_\mu = x^\nu \mathbf{e}_\nu = (x^\nu a^\mu{}_\nu)\, \mathbf{e}'_\mu \tag{1.4}$$

by comparing the coefficients of $\mathbf{e}'_\mu$. We find that $x'^\mu = a^\mu{}_\nu x^\nu$. Observe how the $\mathbf{e}_\mu$ and the $x^\mu$ transform in "opposite" directions. The components $x^\mu$ are therefore said to transform contravariantly.
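The contravariant rule can be checked on a small example. The following is a minimal sketch, assuming a two-dimensional space, an arbitrarily chosen coefficient matrix `a` (stored as `a[mu][nu]` for $a^\mu{}_\nu$), and the primed basis taken to be the standard basis of $\mathbb{R}^2$; none of these choices come from the text. It confirms that the old components in the old basis and the new components in the new basis assemble into the same vector.

```python
from fractions import Fraction as F

# Hypothetical coefficients a^mu_nu, stored as a[mu][nu]; e_nu = a^mu_nu e'_mu.
a = [[F(2), F(1)],
     [F(1), F(1)]]

def inverse_2x2(m):
    """Inverse of a 2x2 matrix, written out explicitly."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[ m[1][1] / det, -m[0][1] / det],
            [-m[1][0] / det,  m[0][0] / det]]

a_inv = inverse_2x2(a)  # coefficients (a^{-1})^mu_nu of equation (1.2)

# Assumption for concreteness: the primed basis is the standard basis of R^2.
e_prime = [[F(1), F(0)], [F(0), F(1)]]            # e'_1, e'_2
# Equation (1.1): e_nu = a^mu_nu e'_mu
e = [[sum(a[mu][nu] * e_prime[mu][k] for mu in range(2)) for k in range(2)]
     for nu in range(2)]

x_old = [F(3), F(-1)]                             # components x^nu in the e basis
# Contravariant rule: x'^mu = a^mu_nu x^nu
x_new = [sum(a[mu][nu] * x_old[nu] for nu in range(2)) for mu in range(2)]

def assemble(components, basis):
    """Build the vector sum_i components[i] * basis[i]."""
    return [sum(components[i] * basis[i][k] for i in range(2)) for k in range(2)]

vector_from_old = assemble(x_old, e)              # x^nu e_nu
vector_from_new = assemble(x_new, e_prime)        # x'^mu e'_mu
assert vector_from_old == vector_from_new         # same geometric vector
```

Exact rational arithmetic (`fractions.Fraction`) is used so that the comparison is an equality rather than a floating-point tolerance check.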

Associated with the vector space $V$ is its dual space $V^*$, whose elements are covectors, i.e. linear maps $\mathbf{f} : V \to \mathbb{R}$. If $\mathbf{f} \in V^*$ and $\mathbf{x} = x^\mu \mathbf{e}_\mu$, we use the linearity property to evaluate $\mathbf{f}(\mathbf{x})$ as

$$\mathbf{f}(\mathbf{x}) = \mathbf{f}(x^\mu \mathbf{e}_\mu) = x^\mu \mathbf{f}(\mathbf{e}_\mu) = x^\mu f_\mu. \tag{1.5}$$

Here, the set of numbers $f_\mu = \mathbf{f}(\mathbf{e}_\mu)$ are the components of the covector $\mathbf{f}$. If we change basis so that $\mathbf{e}_\nu = a^\mu{}_\nu \mathbf{e}'_\mu$ then

$$f_\nu = \mathbf{f}(\mathbf{e}_\nu) = \mathbf{f}(a^\mu{}_\nu \mathbf{e}'_\mu) = a^\mu{}_\nu\, \mathbf{f}(\mathbf{e}'_\mu) = a^\mu{}_\nu f'_\mu. \tag{1.6}$$

We conclude that $f_\nu = a^\mu{}_\nu f'_\mu$. The $f_\mu$ components transform in the same manner as the basis. They are therefore said to transform covariantly. In physics it is traditional to call the set of numbers $x^\mu$ with upstairs indices (the components of) a contravariant vector. Similarly, the set of numbers $f_\mu$ with downstairs indices is called (the components of) a covariant vector. Thus, contravariant vectors are elements of $V$ and covariant vectors are elements of $V^*$.

The relationship between $V$ and $V^*$ is one of mutual duality, and to mathematicians it is only a matter of convenience which space is $V$ and which space is $V^*$. The evaluation of $\mathbf{f} \in V^*$ on $\mathbf{x} \in V$ is therefore often written as a "pairing" $(\mathbf{f}, \mathbf{x})$, which gives equal status to the objects being put together to get a number. A physics example of such a mutually dual pair is provided by the space of displacements $\mathbf{x}$ and the space of wave-numbers $\mathbf{k}$. The units of $\mathbf{x}$ and $\mathbf{k}$ are different (meters versus meters$^{-1}$). There is therefore no meaning to "$\mathbf{x} + \mathbf{k}$," and $\mathbf{x}$ and $\mathbf{k}$ are not elements of the same vector space. The "dot" in expressions such as

$$\psi(\mathbf{x}) = e^{i\mathbf{k}\cdot\mathbf{x}} \tag{1.7}$$
cannot be a true inner product (which requires the objects it links to be in the same vector space) but is instead a pairing

$$(\mathbf{k}, \mathbf{x}) \equiv \mathbf{k}(\mathbf{x}) = k_\mu x^\mu. \tag{1.8}$$

In describing the physical world we usually give priority to the space in which we live, breathe and move, and so treat it as being "$V$". The displacement vector $\mathbf{x}$ then becomes the contravariant vector, and the Fourier-space wave-number $\mathbf{k}$, being the more abstract quantity, becomes the covariant covector.

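Our vector space may come equipped with a metric that is derived from a non-degenerate inner product. We regard the inner product as being a bilinear form $g : V \times V \to \mathbb{R}$, so the length $\|\mathbf{x}\|$ of a vector $\mathbf{x}$ is $\sqrt{g(\mathbf{x}, \mathbf{x})}$. The set of numbers

$$g_{\mu\nu} = g(\mathbf{e}_\mu, \mathbf{e}_\nu) \tag{1.9}$$

comprises the (components of) the metric tensor. In terms of them, the inner product $\langle \mathbf{x}, \mathbf{y} \rangle$ of a pair of vectors $\mathbf{x} = x^\mu \mathbf{e}_\mu$ and $\mathbf{y} = y^\mu \mathbf{e}_\mu$ becomes

$$\langle \mathbf{x}, \mathbf{y} \rangle \equiv g(\mathbf{x}, \mathbf{y}) = g_{\mu\nu} x^\mu y^\nu. \tag{1.10}$$

Real-valued inner products are always symmetric, so $g(\mathbf{x}, \mathbf{y}) = g(\mathbf{y}, \mathbf{x})$ and $g_{\mu\nu} = g_{\nu\mu}$. As the product is non-degenerate, the matrix $g_{\mu\nu}$ has an inverse, which is traditionally written as $g^{\mu\nu}$. Thus

$$g^{\mu\nu} g_{\nu\lambda} = g_{\lambda\nu} g^{\nu\mu} = \delta^\mu_\lambda. \tag{1.11}$$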

The additional structure provided by the metric permits us to identify $V$ with $V^*$. The identification is possible because, given any $\mathbf{f} \in V^*$, we can find a vector $\tilde{\mathbf{f}} \in V$ such that

$$\mathbf{f}(\mathbf{x}) = \langle \tilde{\mathbf{f}}, \mathbf{x} \rangle. \tag{1.12}$$

We obtain $\tilde{\mathbf{f}}$ by solving the equation

$$f_\mu = g_{\mu\nu} \tilde{f}^\nu \tag{1.13}$$

to get $\tilde{f}^\nu = g^{\nu\mu} f_\mu$. We may now drop the tilde and identify $\mathbf{f}$ with $\tilde{\mathbf{f}}$, and hence $V$ with $V^*$. When we do this, we say that the covariant components $f_\mu$ are related to the contravariant components $f^\mu$ by raising

$$f^\mu = g^{\mu\nu} f_\nu, \tag{1.14}$$

or lowering

$$f_\mu = g_{\mu\nu} f^\nu, \tag{1.15}$$

the index $\mu$ using the metric tensor. Bear in mind that this $V \cong V^*$ identification depends crucially on the metric. A different metric will, in general, identify an $\mathbf{f} \in V^*$ with a completely different $\tilde{\mathbf{f}} \in V$.

We may play this game in the Euclidean space $\mathbb{E}^n$ with its "dot" inner product. Given a vector $\mathbf{x}$ and a basis $\mathbf{e}_\mu$ for which $g_{\mu\nu} = \mathbf{e}_\mu \cdot \mathbf{e}_\nu$, we can define two sets of components for the same vector. Firstly the coefficients $x^\mu$ appearing in the basis expansion

$$\mathbf{x} = x^\mu \mathbf{e}_\mu, \tag{1.16}$$

and secondly the "components"

$$x_\mu = \mathbf{e}_\mu \cdot \mathbf{x} = g(\mathbf{e}_\mu, \mathbf{x}) = g(\mathbf{e}_\mu, x^\nu \mathbf{e}_\nu) = g(\mathbf{e}_\mu, \mathbf{e}_\nu)\, x^\nu = g_{\mu\nu} x^\nu \tag{1.17}$$

of $\mathbf{x}$ along the basis vectors. These two sets of numbers are then respectively called the contravariant and covariant components of the vector $\mathbf{x}$. If the $\mathbf{e}_\mu$ constitute an orthonormal basis, where $g_{\mu\nu} = \delta_{\mu\nu}$, then the two sets of components (covariant and contravariant) are numerically coincident. In a non-orthogonal basis they will be different, and we must take care never to add contravariant components to covariant ones.
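To see the two kinds of component differ, it helps to work through a non-orthogonal basis by hand. The sketch below assumes a particular oblique basis of the Euclidean plane and a particular vector (both chosen only for illustration). It computes the covariant components both as $\mathbf{e}_\mu \cdot \mathbf{x}$ and as $g_{\mu\nu} x^\nu$, and then raises the index again with $g^{\mu\nu}$.

```python
from fractions import Fraction as F

# Hypothetical oblique basis of the plane, given in Cartesian coordinates.
e = [[F(1), F(0)],      # e_1
     [F(1), F(1)]]      # e_2

def dot(u, v):
    """Ordinary Euclidean dot product."""
    return sum(ui * vi for ui, vi in zip(u, v))

# Metric components g_{mu nu} = e_mu . e_nu
g = [[dot(e[m], e[n]) for n in range(2)] for m in range(2)]

# Inverse metric g^{mu nu} (explicit 2x2 inverse)
det_g = g[0][0] * g[1][1] - g[0][1] * g[1][0]
g_inv = [[ g[1][1] / det_g, -g[0][1] / det_g],
         [-g[1][0] / det_g,  g[0][0] / det_g]]

x_up = [F(2), F(3)]     # contravariant components x^mu
# The vector itself, x = x^mu e_mu, in Cartesian coordinates
x_vec = [sum(x_up[m] * e[m][k] for m in range(2)) for k in range(2)]

# Covariant components computed two ways, as in equation (1.17)
x_down_dot = [dot(e[m], x_vec) for m in range(2)]                       # e_mu . x
x_down_metric = [sum(g[m][n] * x_up[n] for n in range(2)) for m in range(2)]  # g_{mu nu} x^nu

# Raising the index with g^{mu nu} recovers the contravariant components
x_up_again = [sum(g_inv[m][n] * x_down_metric[n] for n in range(2)) for m in range(2)]

# The contraction x^mu x_mu reproduces the squared Euclidean length
length_squared = sum(u * d for u, d in zip(x_up, x_down_metric))
assert length_squared == dot(x_vec, x_vec)
```

In this basis the contravariant components are (2, 3) while the covariant ones are (5, 8), which makes concrete why the two sets must not be added together.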

1.2 Tensors 

We now introduce tensors in two ways: firstly as sets of numbers labelled by 
indices and equipped with transformation laws that tell us how these numbers 
change as we change basis; and secondly as basis-independent objects that 
are elements of a vector space constructed by taking multiple tensor products 
of the spaces V and V*. 

1.2.1 Transformation rules 

After we change basis $\mathbf{e}_\mu \to \mathbf{e}'_\mu$, where $\mathbf{e}_\nu = a^\mu{}_\nu \mathbf{e}'_\mu$, the metric tensor will be represented by a new set of components

$$g'_{\mu\nu} = g(\mathbf{e}'_\mu, \mathbf{e}'_\nu). \tag{1.18}$$

These are related to the old components by

$$g_{\mu\nu} = g(\mathbf{e}_\mu, \mathbf{e}_\nu) = g(a^\rho{}_\mu \mathbf{e}'_\rho, a^\sigma{}_\nu \mathbf{e}'_\sigma) = a^\rho{}_\mu a^\sigma{}_\nu\, g(\mathbf{e}'_\rho, \mathbf{e}'_\sigma) = a^\rho{}_\mu a^\sigma{}_\nu\, g'_{\rho\sigma}. \tag{1.19}$$

This transformation rule for $g_{\mu\nu}$ has both of its subscripts behaving like the downstairs indices of a covector. We therefore say that $g_{\mu\nu}$ transforms as a doubly covariant tensor. Written out in full, for a two-dimensional space, the transformation law is

$$\begin{aligned}
g_{11} &= a^1{}_1 a^1{}_1 g'_{11} + a^1{}_1 a^2{}_1 g'_{12} + a^2{}_1 a^1{}_1 g'_{21} + a^2{}_1 a^2{}_1 g'_{22},\\
g_{12} &= a^1{}_1 a^1{}_2 g'_{11} + a^1{}_1 a^2{}_2 g'_{12} + a^2{}_1 a^1{}_2 g'_{21} + a^2{}_1 a^2{}_2 g'_{22},\\
g_{21} &= a^1{}_2 a^1{}_1 g'_{11} + a^1{}_2 a^2{}_1 g'_{12} + a^2{}_2 a^1{}_1 g'_{21} + a^2{}_2 a^2{}_1 g'_{22},\\
g_{22} &= a^1{}_2 a^1{}_2 g'_{11} + a^1{}_2 a^2{}_2 g'_{12} + a^2{}_2 a^1{}_2 g'_{21} + a^2{}_2 a^2{}_2 g'_{22}.
\end{aligned}$$

In three dimensions each row would have nine terms, and sixteen in four dimensions. We see why Einstein was driven to invent his summation convention!

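A set of numbers $Q^{\alpha\beta}{}_{\gamma\delta\epsilon}$, whose indices range from 1 to the dimension of the space and that transforms as

$$Q^{\alpha\beta}{}_{\gamma\delta\epsilon} = (a^{-1})^\alpha{}_{\alpha'} (a^{-1})^\beta{}_{\beta'}\, a^{\gamma'}{}_\gamma\, a^{\delta'}{}_\delta\, a^{\epsilon'}{}_\epsilon\, Q'^{\alpha'\beta'}{}_{\gamma'\delta'\epsilon'}, \tag{1.20}$$

or conversely as

$$Q'^{\alpha\beta}{}_{\gamma\delta\epsilon} = a^\alpha{}_{\alpha'}\, a^\beta{}_{\beta'}\, (a^{-1})^{\gamma'}{}_\gamma (a^{-1})^{\delta'}{}_\delta (a^{-1})^{\epsilon'}{}_\epsilon\, Q^{\alpha'\beta'}{}_{\gamma'\delta'\epsilon'}, \tag{1.21}$$

comprises the components of a doubly contravariant, triply covariant tensor. More compactly, the $Q^{\alpha\beta}{}_{\gamma\delta\epsilon}$ are the components of a tensor of type (2, 3). Tensors of type $(p, q)$ are defined analogously. The total number of indices $p + q$ is called the rank of the tensor.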

Note how the indices are wired up in the transformation rules (1.20) and 
(1.21): free (not summed over) upstairs indices on the left hand side of the 
equations match to free upstairs indices on the right hand side, similarly for 
the downstairs indices. Also upstairs indices are summed only with down- 
stairs ones. 

Similar conditions apply to equations relating tensors in any particular 
basis. If they are violated you do not have a valid tensor equation — meaning 
that an equation valid in one basis will not be valid in another basis. Thus 
an equation 

$$A^\mu{}_{\nu\lambda} = B^{\mu\tau}{}_{\nu\lambda\tau} + C^\mu{}_{\nu\lambda} \tag{1.22}$$

is fine, but

(1.23)

has something wrong in each term.

Incidentally, although not illegal, it is a good idea not to write tensor indices directly underneath one another (i.e. do not write $Q^{ij}_{kjl}$) because if you raise or lower indices using the metric tensor, and some pages later in a calculation try to put them back where they were, they might end up in the wrong order.

Tensor algebra 

The sum of two tensors of a given type is also a tensor of that type. The sum 
of two tensors of different types is not a tensor. Thus each particular type of 
tensor constitutes a distinct vector space, but one derived from the common 
underlying vector space whose change-of-basis formula is being utilized. 

Tensors can be combined by multiplication: if $A^\mu{}_{\nu\lambda}$ and $B^\mu{}_{\nu\lambda\tau}$ are tensors of type (1, 2) and (1, 3) respectively, then

$$C^{\alpha\beta}{}_{\nu\lambda\rho\sigma\tau} = A^\alpha{}_{\nu\lambda}\, B^\beta{}_{\rho\sigma\tau} \tag{1.24}$$

is a tensor of type (2, 5).

An important operation is contraction, which consists of setting one or more contravariant indices equal to covariant indices and summing over the repeated indices. This reduces the rank of the tensor. So, for example,

$$D_{\rho\sigma\tau} = C^{\alpha\beta}{}_{\alpha\beta\rho\sigma\tau} \tag{1.25}$$

is a tensor of type (0, 3). Similarly $\mathbf{f}(\mathbf{x}) = f_\mu x^\mu$ is a type (0, 0) tensor, i.e. an invariant: a number that takes the same value in all bases. Upper indices can only be contracted with lower indices, and vice versa. For example, the array of numbers $A_\alpha = B_{\alpha\beta\beta}$ obtained from the type (0, 3) tensor $B_{\alpha\beta\gamma}$ is not a tensor of type (0, 1).

The contraction procedure outputs a tensor because setting an upper index and a lower index to a common value $\mu$ and summing over $\mu$ leads to the factor $\ldots (a^{-1})^\mu{}_\alpha\, a^\beta{}_\mu \ldots$ appearing in the transformation rule. Now

$$(a^{-1})^\mu{}_\alpha\, a^\beta{}_\mu = \delta^\beta_\alpha, \tag{1.26}$$

and the Kronecker delta effects a summation over the corresponding pair of indices in the transformed tensor.
Although often associated with general relativity, tensors occur in many places in physics. They are used, for example, in elasticity theory, where the word "tensor" in its modern meaning was introduced by Woldemar Voigt in 1898. Voigt, following Cauchy and Green, described the infinitesimal deformation of an elastic body by the strain tensor $e_{\alpha\beta}$, which is a tensor of type (0, 2). The forces to which the strain gives rise are described by the stress tensor $\sigma^{\lambda\mu}$. A generalization of Hooke's law relates stress to strain via a tensor of elastic constants $c^{\alpha\beta\gamma\delta}$ as

$$\sigma^{\alpha\beta} = c^{\alpha\beta\gamma\delta} e_{\gamma\delta}. \tag{1.27}$$

We study stress and strain in more detail later in this chapter.

Exercise 1.1: Show that $g^{\mu\nu}$, the matrix inverse of the metric tensor $g_{\mu\nu}$, is indeed a doubly contravariant tensor, as the position of its indices suggests.



1.2.2 Tensor character of linear maps and quadratic forms

As an illustration of the tensor concept and of the need to distinguish be- 
tween upstairs and downstairs indices, we contrast the properties of matrices 
representing linear maps and those representing quadratic forms. 

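A linear map $M : V \to V$ is an object that exists independently of any basis. Given a basis, however, it is represented by a matrix $M^\mu{}_\nu$ obtained by examining the action of the map on the basis elements:

$$M(\mathbf{e}_\mu) = \mathbf{e}_\nu M^\nu{}_\mu. \tag{1.28}$$

Acting on $\mathbf{x}$ we get a new vector $\mathbf{y} = M(\mathbf{x})$, where

$$y^\nu \mathbf{e}_\nu = \mathbf{y} = M(\mathbf{x}) = M(x^\mu \mathbf{e}_\mu) = x^\mu M(\mathbf{e}_\mu) = x^\mu M^\nu{}_\mu \mathbf{e}_\nu = M^\nu{}_\mu x^\mu\, \mathbf{e}_\nu. \tag{1.29}$$

We therefore have

$$y^\nu = M^\nu{}_\mu x^\mu, \tag{1.30}$$

which is the usual matrix multiplication $\mathbf{y} = \mathbf{M}\mathbf{x}$. When we change basis, $\mathbf{e}_\nu = a^\mu{}_\nu \mathbf{e}'_\mu$, then

$$\mathbf{e}_\nu M^\nu{}_\mu = M(\mathbf{e}_\mu) = M(a^\rho{}_\mu \mathbf{e}'_\rho) = a^\rho{}_\mu M(\mathbf{e}'_\rho) = a^\rho{}_\mu\, \mathbf{e}'_\sigma M'^\sigma{}_\rho = a^\rho{}_\mu (a^{-1})^\nu{}_\sigma\, \mathbf{e}_\nu M'^\sigma{}_\rho. \tag{1.31}$$

Comparing coefficients of $\mathbf{e}_\nu$, we find

$$M^\nu{}_\mu = a^\rho{}_\mu (a^{-1})^\nu{}_\sigma M'^\sigma{}_\rho, \tag{1.32}$$

or, conversely,

$$M'^\nu{}_\mu = (a^{-1})^\rho{}_\mu\, a^\nu{}_\sigma M^\sigma{}_\rho. \tag{1.33}$$

Thus a matrix representing a linear map has the tensor character suggested by the position of its indices, i.e. it transforms as a type (1, 1) tensor. We can derive the same formula in matrix notation. In the new basis the vectors $\mathbf{x}$ and $\mathbf{y}$ have new components $\mathbf{x}' = \mathbf{A}\mathbf{x}$ and $\mathbf{y}' = \mathbf{A}\mathbf{y}$. Consequently $\mathbf{y} = \mathbf{M}\mathbf{x}$ becomes

$$\mathbf{y}' = \mathbf{A}\mathbf{y} = \mathbf{A}\mathbf{M}\mathbf{x} = \mathbf{A}\mathbf{M}\mathbf{A}^{-1}\mathbf{x}', \tag{1.34}$$

and the matrix representing the map $M$ has new components

$$\mathbf{M}' = \mathbf{A}\mathbf{M}\mathbf{A}^{-1}. \tag{1.35}$$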

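Now consider the quadratic form $Q : V \to \mathbb{R}$ that is obtained from a symmetric bilinear form $Q : V \times V \to \mathbb{R}$ by setting $Q(\mathbf{x}) = Q(\mathbf{x}, \mathbf{x})$. We can write

$$Q(\mathbf{x}) = Q_{\mu\nu} x^\mu x^\nu = x^\mu Q_{\mu\nu} x^\nu = \mathbf{x}^T \mathbf{Q} \mathbf{x}, \tag{1.36}$$

where $Q_{\mu\nu} \equiv Q(\mathbf{e}_\mu, \mathbf{e}_\nu)$ are the entries in the symmetric matrix $\mathbf{Q}$, the suffix $T$ denotes transposition, and $\mathbf{x}^T \mathbf{Q} \mathbf{x}$ is standard matrix-multiplication notation. Just as does the metric tensor, the coefficients $Q_{\mu\nu}$ transform as a type (0, 2) tensor:

$$Q_{\mu\nu} = a^\alpha{}_\mu a^\beta{}_\nu Q'_{\alpha\beta}. \tag{1.37}$$

In matrix notation the vector $\mathbf{x}$ again transforms to have new components $\mathbf{x}' = \mathbf{A}\mathbf{x}$, but $\mathbf{x}'^T = \mathbf{x}^T \mathbf{A}^T$. Consequently

$$\mathbf{x}'^T \mathbf{Q}' \mathbf{x}' = \mathbf{x}^T \mathbf{A}^T \mathbf{Q}' \mathbf{A} \mathbf{x}. \tag{1.38}$$

Thus

$$\mathbf{Q} = \mathbf{A}^T \mathbf{Q}' \mathbf{A}. \tag{1.39}$$

The message is that linear maps and quadratic forms can both be represented by matrices, but these matrices correspond to distinct types of tensor and transform differently under a change of basis.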

A matrix representing a linear map has a basis-independent determinant. Similarly the trace of a matrix representing a linear map

$$\operatorname{tr} \mathbf{M} \equiv M^\mu{}_\mu \tag{1.40}$$

is a tensor of type (0, 0), i.e. a scalar, and therefore basis independent. On the other hand, while you can certainly compute the determinant or the trace of the matrix representing a quadratic form in some particular basis, when you change basis and calculate the determinant or trace of the transformed matrix, you will get a different number.
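The difference between the two transformation laws is easy to see numerically. The following sketch assumes arbitrary illustrative 2-by-2 matrices `A`, `M` and `Q` (none of them from the text). It applies $\mathbf{M}' = \mathbf{A}\mathbf{M}\mathbf{A}^{-1}$ to the linear map and $\mathbf{Q}' = (\mathbf{A}^{-1})^T \mathbf{Q} \mathbf{A}^{-1}$, which is (1.39) solved for $\mathbf{Q}'$, to the quadratic form, and then compares traces and determinants.

```python
from fractions import Fraction as F

def matmul(p, q):
    """Product of two matrices given as lists of rows."""
    return [[sum(p[i][k] * q[k][j] for k in range(len(q))) for j in range(len(q[0]))]
            for i in range(len(p))]

def transpose(m):
    return [list(row) for row in zip(*m)]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[ m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d,  m[0][0] / d]]

def trace(m):
    return m[0][0] + m[1][1]

def quadratic_value(q, x):
    """Evaluate x^T Q x for a vector x given as a flat list."""
    return sum(x[i] * q[i][j] * x[j] for i in range(2) for j in range(2))

# Illustrative choices (assumptions, not taken from the text)
A = [[F(2), F(1)], [F(0), F(1)]]     # change of basis, x' = A x
M = [[F(1), F(2)], [F(3), F(4)]]     # matrix of a linear map, type (1, 1)
Q = [[F(1), F(2)], [F(2), F(5)]]     # symmetric matrix of a quadratic form, type (0, 2)
A_inv = inv2(A)

M_new = matmul(matmul(A, M), A_inv)                  # equation (1.35)
Q_new = matmul(matmul(transpose(A_inv), Q), A_inv)   # equation (1.39) solved for Q'

# Linear map: trace and determinant survive the change of basis
assert trace(M_new) == trace(M)
assert det2(M_new) == det2(M)

# Quadratic form: the trace changes and the determinant picks up (det A)^(-2)
assert trace(Q_new) != trace(Q)
assert det2(Q_new) == det2(Q) / det2(A) ** 2

# What is invariant for the form is its value on a given vector
x = [F(1), F(1)]
x_new = [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]
assert quadratic_value(Q_new, x_new) == quadratic_value(Q, x)
```

The last assertion is the content of (1.38): the number $Q(\mathbf{x})$ is basis independent even though the trace and determinant of the representing matrix are not.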

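It is possible to make a quadratic form out of a linear map, but this requires using the metric to lower the contravariant index on the matrix representing the map:

$$Q(\mathbf{x}) = x^\mu g_{\mu\nu} Q^\nu{}_\lambda x^\lambda = \mathbf{x} \cdot \mathbf{Q}\mathbf{x}. \tag{1.41}$$

Be careful, therefore: the matrices "$\mathbf{Q}$" in $\mathbf{x}^T \mathbf{Q} \mathbf{x}$ and in $\mathbf{x} \cdot \mathbf{Q}\mathbf{x}$ are representing different mathematical objects.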

Exercise 1.2: In this problem we will use the distinction between the transformation law of a quadratic form and that of a linear map to resolve the following "paradox":

• In quantum mechanics we are taught that the matrices representing two operators can be simultaneously diagonalized only if they commute.

• In classical mechanics we are taught how, given the Lagrangian

$$L = \sum_{ij} \left( \tfrac{1}{2} \dot{q}_i M_{ij} \dot{q}_j - \tfrac{1}{2} q_i V_{ij} q_j \right),$$

to construct normal co-ordinates $Q_i$ such that $L$ becomes

$$L = \sum_i \left( \tfrac{1}{2} \dot{Q}_i^2 - \tfrac{1}{2} \omega_i^2 Q_i^2 \right).$$

We have apparently managed to simultaneously diagonalize the matrices $M_{ij} \to \operatorname{diag}(1, \ldots, 1)$ and $V_{ij} \to \operatorname{diag}(\omega_1^2, \ldots, \omega_n^2)$, even though there is no reason for them to commute with each other!

Show that when $\mathbf{M}$ and $\mathbf{V}$ are a pair of symmetric matrices, with $\mathbf{M}$ being positive definite, then there exists an invertible matrix $\mathbf{A}$ such that $\mathbf{A}^T \mathbf{M} \mathbf{A}$ and $\mathbf{A}^T \mathbf{V} \mathbf{A}$ are simultaneously diagonal. (Hint: Consider $\mathbf{M}$ as defining an inner product, and use the Gram-Schmidt procedure to first find an orthonormal frame in which $M'_{ij} = \delta_{ij}$. Then show that the matrix corresponding to $\mathbf{V}$ in this frame can be diagonalized by a further transformation that does not perturb the already diagonal $M'_{ij}$.)
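1.2.3 Tensor product spaces

We may regard the set of numbers $Q^{\alpha\beta}{}_{\gamma\delta\epsilon}$ as being the components of an object $\mathbf{Q}$ that is an element of the vector space of type (2, 3) tensors. We denote this vector space by the symbol $V \otimes V \otimes V^* \otimes V^* \otimes V^*$, the notation indicating that it is derived from the original $V$ and its dual $V^*$ by taking tensor products of these spaces. The tensor $\mathbf{Q}$ is to be thought of as existing as an element of $V \otimes V \otimes V^* \otimes V^* \otimes V^*$ independently of any basis, but given a basis $\{\mathbf{e}_\mu\}$ for $V$, and the dual basis $\{\mathbf{e}^{*\nu}\}$ for $V^*$, we expand it as

$$\mathbf{Q} = Q^{\alpha\beta}{}_{\gamma\delta\epsilon}\, \mathbf{e}_\alpha \otimes \mathbf{e}_\beta \otimes \mathbf{e}^{*\gamma} \otimes \mathbf{e}^{*\delta} \otimes \mathbf{e}^{*\epsilon}. \tag{1.42}$$

Here the tensor product symbol "$\otimes$" is distributive

$$\begin{aligned}
\mathbf{a} \otimes (\mathbf{b} + \mathbf{c}) &= \mathbf{a} \otimes \mathbf{b} + \mathbf{a} \otimes \mathbf{c},\\
(\mathbf{a} + \mathbf{b}) \otimes \mathbf{c} &= \mathbf{a} \otimes \mathbf{c} + \mathbf{b} \otimes \mathbf{c},
\end{aligned} \tag{1.43}$$

and associative

$$(\mathbf{a} \otimes \mathbf{b}) \otimes \mathbf{c} = \mathbf{a} \otimes (\mathbf{b} \otimes \mathbf{c}), \tag{1.44}$$

but is not commutative

$$\mathbf{a} \otimes \mathbf{b} \neq \mathbf{b} \otimes \mathbf{a}. \tag{1.45}$$

Everything commutes with the field, however,

$$\lambda(\mathbf{a} \otimes \mathbf{b}) = (\lambda\mathbf{a}) \otimes \mathbf{b} = \mathbf{a} \otimes (\lambda\mathbf{b}). \tag{1.46}$$

If we change basis $\mathbf{e}_\alpha = a^\beta{}_\alpha \mathbf{e}'_\beta$ then these rules lead, for example, to

$$\mathbf{e}_\alpha \otimes \mathbf{e}_\beta = a^\lambda{}_\alpha a^\mu{}_\beta\, \mathbf{e}'_\lambda \otimes \mathbf{e}'_\mu. \tag{1.47}$$

From this change-of-basis formula, we deduce that

$$T^{\alpha\beta}\, \mathbf{e}_\alpha \otimes \mathbf{e}_\beta = T^{\alpha\beta} a^\lambda{}_\alpha a^\mu{}_\beta\, \mathbf{e}'_\lambda \otimes \mathbf{e}'_\mu = T'^{\lambda\mu}\, \mathbf{e}'_\lambda \otimes \mathbf{e}'_\mu, \tag{1.48}$$

where

$$T'^{\lambda\mu} = T^{\alpha\beta} a^\lambda{}_\alpha a^\mu{}_\beta. \tag{1.49}$$

The analogous formula for $\mathbf{e}_\alpha \otimes \mathbf{e}_\beta \otimes \mathbf{e}^{*\gamma} \otimes \mathbf{e}^{*\delta} \otimes \mathbf{e}^{*\epsilon}$ reproduces the transformation rule for the components of $\mathbf{Q}$.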

The meaning of the tensor product of a collection of vector spaces should now be clear: If $\mathbf{e}_\mu$ constitute a basis for $V$, the space $V \otimes V$ is, for example, the space of all linear combinations¹ of the abstract symbols $\mathbf{e}_\mu \otimes \mathbf{e}_\nu$, which we declare by fiat to constitute a basis for this space. There is no geometric significance (as there is with a vector product $\mathbf{a} \times \mathbf{b}$) to the tensor product $\mathbf{a} \otimes \mathbf{b}$, so the $\mathbf{e}_\mu \otimes \mathbf{e}_\nu$ are simply useful place-keepers. Remember that these are ordered pairs, $\mathbf{e}_\mu \otimes \mathbf{e}_\nu \neq \mathbf{e}_\nu \otimes \mathbf{e}_\mu$.

Although there is no geometric meaning, it is possible to give an algebraic meaning to a product like $\mathbf{e}^{*\lambda} \otimes \mathbf{e}^{*\mu} \otimes \mathbf{e}^{*\nu}$ by viewing it as a multilinear form $V \times V \times V \to \mathbb{R}$. We define

$$\mathbf{e}^{*\lambda} \otimes \mathbf{e}^{*\mu} \otimes \mathbf{e}^{*\nu}\,(\mathbf{e}_\alpha, \mathbf{e}_\beta, \mathbf{e}_\gamma) = \delta^\lambda_\alpha\, \delta^\mu_\beta\, \delta^\nu_\gamma. \tag{1.50}$$

We may also regard it as a linear map $V \otimes V \otimes V \to \mathbb{R}$ by defining

$$\mathbf{e}^{*\lambda} \otimes \mathbf{e}^{*\mu} \otimes \mathbf{e}^{*\nu}\,(\mathbf{e}_\alpha \otimes \mathbf{e}_\beta \otimes \mathbf{e}_\gamma) = \delta^\lambda_\alpha\, \delta^\mu_\beta\, \delta^\nu_\gamma \tag{1.51}$$

and extending the definition to general elements of $V \otimes V \otimes V$ by linearity. In this way we establish an isomorphism

$$V^* \otimes V^* \otimes V^* \cong (V \otimes V \otimes V)^*. \tag{1.52}$$

This multiple personality is typical of tensor spaces. We have already seen that the metric tensor is simultaneously an element of $V^* \otimes V^*$ and a map $g : V \to V^*$.

Tensor products and quantum mechanics 

When we have two quantum-mechanical systems having Hilbert spaces $\mathcal{H}^{(1)}$ and $\mathcal{H}^{(2)}$, the Hilbert space for the combined system is $\mathcal{H}^{(1)} \otimes \mathcal{H}^{(2)}$. Quantum mechanics books usually denote the vectors in these spaces by the Dirac "bra-ket" notation in which the basis vectors of the separate spaces are denoted by² $|n_1\rangle$ and $|n_2\rangle$, and that of the combined space by $|n_1, n_2\rangle$. In this notation, a state in the combined system is a linear combination

$$|\Psi\rangle = \sum_{n_1, n_2} |n_1, n_2\rangle \langle n_1, n_2 | \Psi \rangle. \tag{1.53}$$

¹ Do not confuse the tensor-product space $V \otimes W$ with the Cartesian product $V \times W$. The latter is the set of all ordered pairs $(\mathbf{x}, \mathbf{y})$, $\mathbf{x} \in V$, $\mathbf{y} \in W$. The tensor product includes also formal sums of such pairs. The Cartesian product of two vector spaces can be given the structure of a vector space by defining an addition operation $\lambda(\mathbf{x}_1, \mathbf{y}_1) + \mu(\mathbf{x}_2, \mathbf{y}_2) = (\lambda\mathbf{x}_1 + \mu\mathbf{x}_2, \lambda\mathbf{y}_1 + \mu\mathbf{y}_2)$, but this construction does not lead to the tensor product. Instead it defines the direct sum $V \oplus W$.

² We assume for notational convenience that the Hilbert spaces are finite dimensional.
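This is the tensor product in disguise. To unmask it, we simply make the notational translation

$$\begin{aligned}
|\Psi\rangle &\to \Psi,\\
\langle n_1, n_2 | \Psi \rangle &\to \psi^{n_1, n_2},\\
|n_1\rangle &\to \mathbf{e}^{(1)}_{n_1},\\
|n_2\rangle &\to \mathbf{e}^{(2)}_{n_2},\\
|n_1, n_2\rangle &\to \mathbf{e}^{(1)}_{n_1} \otimes \mathbf{e}^{(2)}_{n_2}.
\end{aligned} \tag{1.54}$$

Then (1.53) becomes

$$\Psi = \psi^{n_1, n_2}\, \mathbf{e}^{(1)}_{n_1} \otimes \mathbf{e}^{(2)}_{n_2}. \tag{1.55}$$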



Entanglement: Suppose that $\mathcal{H}^{(1)}$ has basis $\mathbf{e}^{(1)}_1, \ldots, \mathbf{e}^{(1)}_m$ and $\mathcal{H}^{(2)}$ has basis $\mathbf{e}^{(2)}_1, \ldots, \mathbf{e}^{(2)}_n$. The Hilbert space $\mathcal{H}^{(1)} \otimes \mathcal{H}^{(2)}$ is then $nm$ dimensional. Consider a state

$$\Psi = \psi^{ij}\, \mathbf{e}^{(1)}_i \otimes \mathbf{e}^{(2)}_j \in \mathcal{H}^{(1)} \otimes \mathcal{H}^{(2)}. \tag{1.56}$$

If we can find vectors

$$\begin{aligned}
\Phi &\equiv \phi^i \mathbf{e}^{(1)}_i \in \mathcal{H}^{(1)},\\
X &\equiv \chi^j \mathbf{e}^{(2)}_j \in \mathcal{H}^{(2)},
\end{aligned} \tag{1.57}$$

such that

$$\Psi = \Phi \otimes X \equiv \phi^i \chi^j\, \mathbf{e}^{(1)}_i \otimes \mathbf{e}^{(2)}_j \tag{1.58}$$

then the tensor $\Psi$ is said to be decomposable and the two quantum systems are said to be unentangled. If there are no such vectors then the two systems are entangled in the sense of the Einstein-Podolsky-Rosen (EPR) paradox.

Quantum states are really in one-to-one correspondence with rays in the Hilbert space, rather than vectors. If we denote the $n$ dimensional vector space over the field of the complex numbers as $\mathbb{C}^n$, the space of rays, in which we do not distinguish between the vectors $\mathbf{x}$ and $\lambda\mathbf{x}$ when $\lambda \neq 0$, is denoted by $\mathbb{C}P^{n-1}$ and is called complex projective space. Complex projective space is where algebraic geometry is studied. The set of decomposable states may be thought of as a subset of the complex projective space $\mathbb{C}P^{nm-1}$, and, since, as the following exercise shows, this subset is defined by a finite number of homogeneous polynomial equations, it forms what algebraic geometers call a variety. This particular subset is known as the Segre variety.
Exercise 1.3: The Segre conditions for a state to be decomposable:

i) By counting the number of independent components that are at our disposal in $\Psi$, and comparing that number with the number of free parameters in $\Phi \otimes X$, show that the coefficients $\psi^{ij}$ must satisfy $(n-1)(m-1)$ relations if the state is to be decomposable.

ii) If the state is decomposable, show that

$$0 = \begin{vmatrix} \psi^{ij} & \psi^{il} \\ \psi^{kj} & \psi^{kl} \end{vmatrix}$$

for all sets of indices $i, j, k, l$.

iii) Assume that $\psi^{11}$ is not zero. Using your count from part (i) as a guide, find a subset of the relations from part (ii) that constitute a necessary and sufficient set of conditions for the state $\Psi$ to be decomposable. Include a proof that your set is indeed sufficient.
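The determinant condition in part (ii) can be tried out on small examples before attempting the proofs. The sketch below is only a numerical check, not a solution to the exercise; the function names `outer` and `satisfies_segre_conditions` and the two sample states are hypothetical choices made for illustration. It evaluates every 2-by-2 determinant of the coefficient array $\psi^{ij}$ for a product state and for a Bell-like state.

```python
from itertools import product

def outer(phi, chi):
    """Coefficients psi^{ij} = phi^i chi^j of the product state Phi (x) X."""
    return [[p * c for c in chi] for p in phi]

def satisfies_segre_conditions(psi):
    """True if psi^{ij} psi^{kl} - psi^{il} psi^{kj} vanishes for all i, j, k, l."""
    m, n = len(psi), len(psi[0])
    return all(psi[i][j] * psi[k][l] - psi[i][l] * psi[k][j] == 0
               for i, k in product(range(m), repeat=2)
               for j, l in product(range(n), repeat=2))

# A state built as a product (m = 2, n = 3), so unentangled by construction
product_state = outer([1, 2], [3, -1, 4])

# e_1 (x) e_1 + e_2 (x) e_2, a Bell-like state with m = n = 2
bell_like = [[1, 0],
             [0, 1]]

print(satisfies_segre_conditions(product_state))  # True
print(satisfies_segre_conditions(bell_like))      # False
```

Integer coefficients are used so that "vanishes" means exact equality; with floating-point or complex amplitudes a tolerance would be needed instead.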

1.2.4 Symmetric and skew-symmetric tensors 

By examining the transformation rule you may see that if a pair of upstairs or downstairs indices is symmetric (say $Q^{\mu\nu}{}_{\rho\sigma\tau} = Q^{\nu\mu}{}_{\rho\sigma\tau}$) or skew-symmetric ($Q^{\mu\nu}{}_{\rho\sigma\tau} = -Q^{\nu\mu}{}_{\rho\sigma\tau}$) in one basis, it remains so after the basis has been changed. (This is not true of a pair composed of one upstairs and one downstairs index.) It makes sense, therefore, to define symmetric and skew-symmetric tensor product spaces. Thus skew-symmetric doubly-contravariant tensors can be regarded as belonging to the space denoted by $\bigwedge^2 V$ and expanded as

$$\mathbf{A} = \frac{1}{2} A^{\mu\nu}\, \mathbf{e}_\mu \wedge \mathbf{e}_\nu, \tag{1.59}$$

where the coefficients are skew-symmetric, $A^{\mu\nu} = -A^{\nu\mu}$, and the wedge product of the basis elements is associative and distributive, as is the tensor product, but in addition obeys $\mathbf{e}_\mu \wedge \mathbf{e}_\nu = -\mathbf{e}_\nu \wedge \mathbf{e}_\mu$. The "1/2" (replaced by $1/p!$ when there are $p$ indices) is convenient in that each independent component only appears once in the sum. For example, in three dimensions,

$$\frac{1}{2} A^{\mu\nu}\, \mathbf{e}_\mu \wedge \mathbf{e}_\nu = A^{12}\, \mathbf{e}_1 \wedge \mathbf{e}_2 + A^{23}\, \mathbf{e}_2 \wedge \mathbf{e}_3 + A^{31}\, \mathbf{e}_3 \wedge \mathbf{e}_1. \tag{1.60}$$

Symmetric doubly-contravariant tensors can be regarded as belonging to the space $\mathrm{sym}^2 V$ and expanded as

$$\mathbf{S} = S^{\alpha\beta}\, \mathbf{e}_\alpha \odot \mathbf{e}_\beta \tag{1.61}$$

where $\mathbf{e}_\alpha \odot \mathbf{e}_\beta = \mathbf{e}_\beta \odot \mathbf{e}_\alpha$ and $S^{\alpha\beta} = S^{\beta\alpha}$. (We do not insert a "1/2" here because including it leads to no particular simplification in any consequent equations.)

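We can treat these symmetric and skew-symmetric products as symmetric or skew multilinear forms. Define, for example,

$$\mathbf{e}^{*\alpha} \wedge \mathbf{e}^{*\beta}\,(\mathbf{e}_\mu, \mathbf{e}_\nu) = \delta^\alpha_\mu \delta^\beta_\nu - \delta^\alpha_\nu \delta^\beta_\mu, \tag{1.62}$$

and

$$\mathbf{e}^{*\alpha} \wedge \mathbf{e}^{*\beta}\,(\mathbf{e}_\mu \wedge \mathbf{e}_\nu) = \delta^\alpha_\mu \delta^\beta_\nu - \delta^\alpha_\nu \delta^\beta_\mu. \tag{1.63}$$

We need two terms on the right-hand side of these examples because the skew-symmetry of $\mathbf{e}^{*\alpha} \wedge \mathbf{e}^{*\beta}(\ ,\ )$ in its slots does not allow us the luxury of demanding that the $\mathbf{e}_\mu$ be inserted in the exact order of the $\mathbf{e}^{*\alpha}$ to get a non-zero answer. Because the $p$-th order analogue of (1.62) has $p!$ terms on its right-hand side, some authors like to divide the right-hand side by $p!$ in this definition. We prefer the one above, though. With our definition, and with $\mathbf{A} = \frac{1}{2} A_{\mu\nu}\, \mathbf{e}^{*\mu} \wedge \mathbf{e}^{*\nu}$ and $\mathbf{B} = \frac{1}{2} B^{\alpha\beta}\, \mathbf{e}_\alpha \wedge \mathbf{e}_\beta$ we have

$$\mathbf{A}(\mathbf{B}) = \frac{1}{2} A_{\mu\nu} B^{\mu\nu} = \sum_{\mu < \nu} A_{\mu\nu} B^{\mu\nu}, \tag{1.64}$$

so the sum is only over independent terms.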

The wedge ($\wedge$) product notation is standard in mathematics wherever skew-symmetry is implied.³ The "sym" and "$\odot$" are not. Different authors use different notations for spaces of symmetric tensors. This reflects the fact that skew-symmetric tensors are extremely useful and appear in many different parts of mathematics, while symmetric ones have fewer special properties (although they are common in physics). Compare the relative usefulness of determinants and permanents.

Exercise 1.4: Show that in d dimensions: 

i) the dimension of the space of skew-symmetric covariant tensors with $p$ indices is $d!/p!(d-p)!$;

ii) the dimension of the space of symmetric covariant tensors with $p$ indices is $(d+p-1)!/p!(d-1)!$.
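The two formulas in Exercise 1.4 can be sanity-checked by brute-force enumeration before attempting a proof. The sketch below assumes, as a counting convention, that each independent component of a skew-symmetric tensor corresponds to a strictly increasing index set and each independent component of a symmetric tensor to a non-decreasing one. The helper names are hypothetical.

```python
from itertools import combinations, combinations_with_replacement
from math import factorial

def count_skew(d, p):
    """Count index sets mu_1 < mu_2 < ... < mu_p with values in range(d)."""
    return sum(1 for _ in combinations(range(d), p))

def count_symmetric(d, p):
    """Count index sets mu_1 <= mu_2 <= ... <= mu_p with values in range(d)."""
    return sum(1 for _ in combinations_with_replacement(range(d), p))

def skew_formula(d, p):
    return factorial(d) // (factorial(p) * factorial(d - p))

def symmetric_formula(d, p):
    return factorial(d + p - 1) // (factorial(p) * factorial(d - 1))

# Compare enumeration with the closed forms for small d and p
for d in range(1, 6):
    for p in range(0, d + 1):
        assert count_skew(d, p) == skew_formula(d, p)
        assert count_symmetric(d, p) == symmetric_formula(d, p)

print(count_skew(4, 2), count_symmetric(4, 2))  # 6 10
```

Agreement for small cases is of course not a proof; it only confirms that the counting convention and the formulas are consistent with each other.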

³ Skew products, along with the first formulation of the idea of an abstract vector space, were introduced in Hermann Grassmann's Ausdehnungslehre (1844). Grassmann's mathematics was not appreciated in his lifetime. In his disappointment he turned to other fields, making significant contributions to the theory of colour mixtures (Grassmann's law), and to the philology of Indo-European languages (another Grassmann's law).
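Bosons and fermions

Spaces of symmetric and skew-symmetric tensors appear whenever we deal with the quantum mechanics of many indistinguishable particles possessing Bose or Fermi statistics. If we have a Hilbert space $\mathcal{H}$ of single-particle states with basis $\mathbf{e}_i$ then the $N$-boson space is $\mathrm{Sym}^N \mathcal{H}$ which consists of states

$$\Phi = \Phi^{i_1 i_2 \ldots i_N}\, \mathbf{e}_{i_1} \odot \mathbf{e}_{i_2} \odot \cdots \odot \mathbf{e}_{i_N}, \tag{1.65}$$

and the $N$-fermion space is $\bigwedge^N \mathcal{H}$, which contains states

$$\Psi = \frac{1}{N!} \Psi^{i_1 i_2 \ldots i_N}\, \mathbf{e}_{i_1} \wedge \mathbf{e}_{i_2} \wedge \cdots \wedge \mathbf{e}_{i_N}. \tag{1.66}$$

The symmetry of the Bose wavefunction

$$\Phi^{i_1 \ldots i_\alpha \ldots i_\beta \ldots i_N} = \Phi^{i_1 \ldots i_\beta \ldots i_\alpha \ldots i_N}, \tag{1.67}$$

and the skew-symmetry of the Fermion wavefunction

$$\Psi^{i_1 \ldots i_\alpha \ldots i_\beta \ldots i_N} = -\Psi^{i_1 \ldots i_\beta \ldots i_\alpha \ldots i_N}, \tag{1.68}$$

under the interchange of the particle labels $\alpha$, $\beta$ is then natural.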

Slater Determinants and the Plücker Relations: Some $N$-fermion states can be decomposed into a product of single-particle states

$$\Psi = \psi_1 \wedge \psi_2 \wedge \cdots \wedge \psi_N = \psi_1^{i_1} \psi_2^{i_2} \cdots \psi_N^{i_N}\, \mathbf{e}_{i_1} \wedge \mathbf{e}_{i_2} \wedge \cdots \wedge \mathbf{e}_{i_N}. \tag{1.69}$$

Comparing the coefficients of $\mathbf{e}_{i_1} \wedge \mathbf{e}_{i_2} \wedge \cdots \wedge \mathbf{e}_{i_N}$ in (1.66) and (1.69) shows that the many-body wavefunction can then be written as

$$\Psi^{i_1 i_2 \ldots i_N} = \begin{vmatrix}
\psi_1^{i_1} & \psi_1^{i_2} & \cdots & \psi_1^{i_N} \\
\psi_2^{i_1} & \psi_2^{i_2} & \cdots & \psi_2^{i_N} \\
\vdots & \vdots & \ddots & \vdots \\
\psi_N^{i_1} & \psi_N^{i_2} & \cdots & \psi_N^{i_N}
\end{vmatrix}. \tag{1.70}$$

The wavefunction is therefore given by a single Slater determinant. Such wavefunctions correspond to a very special class of states. The general many-fermion state is not decomposable, and its wavefunction can only be expressed as a sum of many Slater determinants. The Hartree-Fock method of quantum chemistry is a variational approximation that takes such a single Slater determinant as its trial wavefunction and varies only the one-particle wavefunctions $\langle i | \psi_a \rangle \equiv \psi_a^i$. It is a remarkably successful approximation, given the very restricted class of wavefunctions it explores.

As with the Segre condition for two distinguishable quantum systems to be unentangled, there is a set of necessary and sufficient conditions on the $\Psi^{i_1 i_2 \ldots i_N}$ for the state $\Psi$ to be decomposable into single-particle states. The conditions are that

$$\Psi^{i_1 i_2 \ldots i_{N-1} [j_1} \Psi^{j_2 j_3 \ldots j_{N+1}]} = 0 \tag{1.71}$$

for any choice of indices $i_1, \ldots, i_{N-1}$ and $j_1, \ldots, j_{N+1}$. The square brackets $[\ldots]$ indicate that the expression is to be antisymmetrized over the indices enclosed in the brackets. For example, a three-particle state is decomposable if and only if

$$\Psi^{i_1 i_2 j_1} \Psi^{j_2 j_3 j_4} - \Psi^{i_1 i_2 j_2} \Psi^{j_1 j_3 j_4} + \Psi^{i_1 i_2 j_3} \Psi^{j_1 j_2 j_4} - \Psi^{i_1 i_2 j_4} \Psi^{j_1 j_2 j_3} = 0. \tag{1.72}$$

These conditions are called the Plücker relations after Julius Plücker who discovered them long before the advent of quantum mechanics.⁴ It is easy to show that Plücker's relations are necessary conditions for decomposability. It takes more sophistication to show that they are sufficient. We will therefore defer this task to the exercises at the end of the chapter. As far as we are aware, the Plücker relations are not exploited by quantum chemists, but, in disguise as the Hirota bilinear equations, they constitute the geometric condition underpinning the many-soliton solutions of the Korteweg-de Vries and other soliton equations.
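The simplest non-trivial instance of the Plücker relations is $N = 2$, where (1.71) reduces, up to an overall factor, to $\Psi^{i j_1}\Psi^{j_2 j_3} - \Psi^{i j_2}\Psi^{j_1 j_3} + \Psi^{i j_3}\Psi^{j_1 j_2} = 0$. The following sketch assumes a four-dimensional single-particle space and two hand-picked two-fermion states (illustrative choices only); the function names are hypothetical. It evaluates the left-hand side for every choice of indices.

```python
from itertools import product

def wedge(a, b):
    """Coefficients Psi^{ij} = a^i b^j - a^j b^i of the two-fermion state a ^ b."""
    n = len(a)
    return [[a[i] * b[j] - a[j] * b[i] for j in range(n)] for i in range(n)]

def plucker_residuals(psi):
    """Left-hand side of the N = 2 Plucker relation for every index choice."""
    n = len(psi)
    return [psi[i][j1] * psi[j2][j3]
            - psi[i][j2] * psi[j1][j3]
            + psi[i][j3] * psi[j1][j2]
            for i, j1, j2, j3 in product(range(n), repeat=4)]

# A single Slater determinant built from two single-particle states
slater = wedge([1, 2, 0, -1], [0, 1, 3, 2])

# e_1 ^ e_2 + e_3 ^ e_4, which cannot be written as a single wedge product
not_decomposable = [[ 0, 1,  0, 0],
                    [-1, 0,  0, 0],
                    [ 0, 0,  0, 1],
                    [ 0, 0, -1, 0]]

print(all(r == 0 for r in plucker_residuals(slater)))            # True
print(all(r == 0 for r in plucker_residuals(not_decomposable)))  # False
```

For the second state the relation already fails at $i=1$, $(j_1, j_2, j_3) = (2, 3, 4)$, where the left-hand side equals $\Psi^{12}\Psi^{34} = 1$.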



1.2.5 Kronecker and Levi-Civita tensors 

Suppose the tensor $\delta^\mu_\nu$ is defined, with respect to some basis, to be unity if $\mu = \nu$ and zero otherwise. In a new basis it will transform to

$$\delta'^\mu_\nu = a^\mu{}_\rho (a^{-1})^\sigma{}_\nu\, \delta^\rho_\sigma = a^\mu{}_\rho (a^{-1})^\rho{}_\nu = \delta^\mu_\nu. \tag{1.73}$$

In other words the Kronecker delta symbol of type (1, 1) has the same numerical components in all co-ordinate systems. This is not true of the Kronecker delta symbol of type (0, 2), i.e. of $\delta_{\mu\nu}$.

⁴ As well as his extensive work in algebraic geometry, Plücker (1801-68) made important discoveries in experimental physics. He was, for example, the first person to observe the deflection of cathode rays (beams of electrons) by a magnetic field, and the first to point out that each element had its characteristic emission spectrum.
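Now consider an $n$-dimensional space with a tensor $\eta_{\mu_1 \mu_2 \ldots \mu_n}$ whose components, in some basis, coincide with the Levi-Civita symbol $\epsilon_{\mu_1 \mu_2 \ldots \mu_n}$. We find that in a new frame the components are

$$\begin{aligned}
\eta'_{\mu_1 \mu_2 \ldots \mu_n} &= (a^{-1})^{\nu_1}{}_{\mu_1} (a^{-1})^{\nu_2}{}_{\mu_2} \cdots (a^{-1})^{\nu_n}{}_{\mu_n}\, \epsilon_{\nu_1 \nu_2 \ldots \nu_n}\\
&= \epsilon_{\mu_1 \mu_2 \ldots \mu_n}\, (a^{-1})^{\nu_1}{}_1 (a^{-1})^{\nu_2}{}_2 \cdots (a^{-1})^{\nu_n}{}_n\, \epsilon_{\nu_1 \nu_2 \ldots \nu_n}\\
&= \eta_{\mu_1 \mu_2 \ldots \mu_n} \det \mathbf{A}^{-1}.
\end{aligned} \tag{1.74}$$

Thus, unlike the $\delta^\mu_\nu$, the Levi-Civita symbol is not quite a tensor.

Consider also the quantity

$$\sqrt{g} \equiv \sqrt{\det [g_{\mu\nu}]}. \tag{1.75}$$

Here we assume that the metric is positive-definite, so that the square root is real, and that we have taken the positive square root. Since

$$\det [g'_{\mu\nu}] = \det \left[ (a^{-1})^\rho{}_\mu\, g_{\rho\sigma}\, (a^{-1})^\sigma{}_\nu \right] = (\det \mathbf{A})^{-2} \det [g_{\mu\nu}], \tag{1.76}$$

we see that

$$\sqrt{g'} = |\det \mathbf{A}|^{-1} \sqrt{g}. \tag{1.77}$$

Thus $\sqrt{g}$ is also not quite an invariant. This is only to be expected, because $g(\ ,\ )$ is a quadratic form and we know that there is no basis-independent meaning to the determinant of such an object.

Now define

$$\varepsilon_{\mu_1 \mu_2 \ldots \mu_n} = \sqrt{g}\, \epsilon_{\mu_1 \mu_2 \ldots \mu_n}, \tag{1.78}$$

and assume that $\varepsilon_{\mu_1 \mu_2 \ldots \mu_n}$ has the type $(0, n)$ tensor character implied by its indices. When we look at how this transforms, and restrict ourselves to orientation-preserving changes of basis, i.e. ones for which $\det \mathbf{A}$ is positive, we see that factors of $\det \mathbf{A}$ conspire to give

$$\varepsilon'_{\mu_1 \mu_2 \ldots \mu_n} = \sqrt{g'}\, \epsilon_{\mu_1 \mu_2 \ldots \mu_n}. \tag{1.79}$$

A similar exercise indicates that if we define $\epsilon^{\mu_1 \mu_2 \ldots \mu_n}$ to be numerically equal to $\epsilon_{\mu_1 \mu_2 \ldots \mu_n}$ then

$$\varepsilon^{\mu_1 \mu_2 \ldots \mu_n} = \frac{1}{\sqrt{g}}\, \epsilon^{\mu_1 \mu_2 \ldots \mu_n} \tag{1.80}$$

also transforms as a tensor (in this case a type $(n, 0)$ contravariant one) provided that the factor of $1/\sqrt{g}$ is always calculated with respect to the current basis.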
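The way the factors of $\det \mathbf{A}$ cancel can be watched in a two-dimensional example. The sketch below assumes a particular metric `g` and an orientation-preserving change of basis `A` (both chosen only so that the square roots come out as simple numbers). It applies the covariant rule $t'_{\mu\nu} = (a^{-1})^\rho{}_\mu (a^{-1})^\sigma{}_\nu t_{\rho\sigma}$ first to the bare Levi-Civita symbol and then to $\sqrt{g}\,\epsilon_{\mu\nu}$.

```python
import math
from fractions import Fraction as F

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

def inv2(m):
    d = det2(m)
    return [[ m[1][1] / d, -m[0][1] / d],
            [-m[1][0] / d,  m[0][0] / d]]

def transform_covariant(t, a_inv):
    """t'_{mu nu} = (a^{-1})^rho_mu (a^{-1})^sigma_nu t_{rho sigma}."""
    return [[sum(a_inv[r][m] * a_inv[s][n] * t[r][s]
                 for r in range(2) for s in range(2))
             for n in range(2)] for m in range(2)]

eps = [[0, 1], [-1, 0]]                  # Levi-Civita symbol in two dimensions

# Illustrative choices (assumptions): det A = 2 > 0, det g = 4
A = [[F(2), F(1)], [F(0), F(1)]]         # a^mu_nu stored as A[mu][nu]
g = [[F(4), F(0)], [F(0), F(1)]]
A_inv = inv2(A)

g_new = transform_covariant(g, A_inv)    # metric in the new basis
sqrt_g = math.sqrt(det2(g))
sqrt_g_new = math.sqrt(det2(g_new))

# Equation (1.77): sqrt(g') = |det A|^{-1} sqrt(g)
assert math.isclose(sqrt_g_new, sqrt_g / abs(det2(A)))

# Equation (1.74): the bare symbol picks up a factor det A^{-1}
eta_new = transform_covariant(eps, A_inv)
assert eta_new == [[0, F(1, 2)], [F(-1, 2), 0]]

# Equation (1.79): sqrt(g) * eps, transformed as a tensor, equals sqrt(g') * eps
tensor_old = [[sqrt_g * eps[m][n] for n in range(2)] for m in range(2)]
tensor_new = transform_covariant(tensor_old, A_inv)
assert all(math.isclose(tensor_new[m][n], sqrt_g_new * eps[m][n], abs_tol=1e-12)
           for m in range(2) for n in range(2))
```

With a change of basis having negative $\det \mathbf{A}$, the last check would fail by a sign, which is why the statement is restricted to orientation-preserving transformations.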

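If the dimension $n$ is even and we are given a skew-symmetric tensor $F_{\mu\nu}$, we can therefore construct an invariant

$$\varepsilon^{\mu_1 \mu_2 \ldots \mu_n} F_{\mu_1 \mu_2} \cdots F_{\mu_{n-1} \mu_n} = \frac{1}{\sqrt{g}}\, \epsilon^{\mu_1 \mu_2 \ldots \mu_n} F_{\mu_1 \mu_2} \cdots F_{\mu_{n-1} \mu_n}. \tag{1.81}$$

Similarly, given a skew-symmetric covariant tensor $F_{\mu_1 \ldots \mu_m}$ with $m$ ($\le n$) indices we can form its dual, denoted by $F^*$, an $(n-m)$-contravariant tensor with components

$$(F^*)^{\mu_{m+1} \ldots \mu_n} = \frac{1}{m!}\, \varepsilon^{\mu_1 \mu_2 \ldots \mu_n} F_{\mu_1 \ldots \mu_m} = \frac{1}{\sqrt{g}} \frac{1}{m!}\, \epsilon^{\mu_1 \mu_2 \ldots \mu_n} F_{\mu_1 \ldots \mu_m}. \tag{1.82}$$

We meet this "dual" tensor again, when we study differential forms.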



1.3 Cartesian Tensors 

If we restrict ourselves to Cartesian co-ordinate systems having orthonormal basis vectors, so that $g_{ij} = \delta_{ij}$, then there are considerable simplifications. In particular, we do not have to make a distinction between co- and contra-variant indices. We shall usually write their indices as roman-alphabet suffixes.

A change of basis from one orthonormal $n$-dimensional basis $\mathbf{e}_i$ to another $\mathbf{e}'_i$ will set

$$\mathbf{e}'_i = O_{ij} \mathbf{e}_j, \tag{1.83}$$

where the numbers $O_{ij}$ are the entries in an orthogonal matrix $\mathbf{O}$, i.e. a real matrix obeying $\mathbf{O}^T \mathbf{O} = \mathbf{O} \mathbf{O}^T = \mathbf{I}$, where $T$ denotes the transpose. The set of $n$-by-$n$ orthogonal matrices constitutes the orthogonal group $O(n)$.

1.3.1 Isotropic tensors 

The Kronecker $\delta_{ij}$ with both indices downstairs is unchanged by $O(n)$ transformations,

$$ \delta'_{ij} = O_{ik} O_{jl} \delta_{kl} = O_{ik} O_{jk} = O_{ik} O^T_{kj} = \delta_{ij}, \qquad (1.84) $$

and has the same components in any Cartesian frame. We say that its
components are numerically invariant. A similar property holds for tensors
made up of products of $\delta_{ij}$, such as

$$ T_{ijklmn} = \delta_{ij}\delta_{kl}\delta_{mn}. \qquad (1.85) $$

It is possible to show$^5$ that any tensor whose components are numerically
invariant under all orthogonal transformations is a sum of products of this
form. The most general $O(n)$ invariant tensor of rank four is, for example,

$$ \alpha\,\delta_{ij}\delta_{kl} + \beta\,\delta_{ik}\delta_{jl} + \gamma\,\delta_{il}\delta_{jk}. \qquad (1.86) $$

The determinant of an orthogonal transformation must be $\pm 1$. If we only
allow orientation-preserving changes of basis then we restrict ourselves to
orthogonal transformations $O_{ij}$ with $\det O = 1$. These are the proper
orthogonal transformations. In $n$ dimensions they constitute the group $SO(n)$.
Under $SO(n)$ transformations, both $\delta_{ij}$ and $\epsilon_{i_1 i_2 \ldots i_n}$ are numerically invariant
and the most general $SO(n)$ invariant tensors consist of sums of products of
$\delta_{ij}$'s and $\epsilon_{i_1 i_2 \ldots i_n}$'s. The most general $SO(4)$-invariant rank-four tensor is, for
example,

$$ \alpha\,\delta_{ij}\delta_{kl} + \beta\,\delta_{ik}\delta_{jl} + \gamma\,\delta_{il}\delta_{jk} + \lambda\,\epsilon_{ijkl}. \qquad (1.87) $$

Tensors that are numerically invariant under $SO(n)$ are known as isotropic
tensors.

As there is no longer any distinction between co- and contravariant indices,
we can now contract any pair of indices. In three dimensions, for
example,

$$ B_{ijkl} = \epsilon_{nij}\epsilon_{nkl} \qquad (1.88) $$

is a rank-four isotropic tensor. Now $\epsilon_{i_1 \ldots i_n}$ is not invariant when we transform
via an orthogonal transformation with $\det O = -1$, but the product of two
$\epsilon$'s is invariant under such transformations. The tensor $B_{ijkl}$ is therefore
numerically invariant under the larger group $O(3)$ and must be expressible
as

$$ B_{ijkl} = \alpha\,\delta_{ij}\delta_{kl} + \beta\,\delta_{ik}\delta_{jl} + \gamma\,\delta_{il}\delta_{jk} \qquad (1.89) $$

for some coefficients $\alpha$, $\beta$ and $\gamma$. The following exercise explores some
consequences of this and related facts.
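
As a sanity check on the expansion (1.89), the coefficients can be read off by brute force. This is only an illustrative sketch: the helper names `levi_civita`, `delta` and `B` are hypothetical, indices run over 0, 1, 2 rather than 1, 2, 3, and the permutation sign is computed by counting inversions.

```python
from itertools import product

def levi_civita(*idx):
    """Sign of the permutation idx of (0, ..., n-1); 0 if an index repeats."""
    if len(set(idx)) != len(idx):
        return 0
    sign = 1
    for a in range(len(idx)):
        for b in range(a + 1, len(idx)):
            if idx[a] > idx[b]:
                sign = -sign  # each inversion flips the sign
    return sign

def delta(i, j):
    """Kronecker delta."""
    return 1 if i == j else 0

def B(i, j, k, l):
    """B_ijkl = epsilon_nij epsilon_nkl, summed over n, as in (1.88)."""
    return sum(levi_civita(n, i, j) * levi_civita(n, k, l) for n in range(3))

# Each of these index choices isolates one coefficient of (1.89).
alpha = B(0, 0, 1, 1)
beta = B(0, 1, 0, 1)
gamma = B(0, 1, 1, 0)

# Confirm that the three numbers reproduce all 81 components.
matches = all(
    B(i, j, k, l) == alpha * delta(i, j) * delta(k, l)
    + beta * delta(i, k) * delta(j, l)
    + gamma * delta(i, l) * delta(j, k)
    for i, j, k, l in product(range(3), repeat=4)
)
print(alpha, beta, gamma, matches)
```

The same `levi_civita` helper works for any number of indices, so it can also be used to experiment with the four-dimensional products.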

5 The proof is surprisingly complicated. See, for example, M. Spivak, A Comprehensive 
Introduction to Differential Geometry (second edition) Vol. V, pp. 466-481. 



Exercise 1.5: We defined the $n$-dimensional Levi-Civita symbol by requiring
that $\epsilon_{i_1 i_2 \ldots i_n}$ be antisymmetric in all pairs of indices, and $\epsilon_{12\ldots n} = 1$.

a) Show that $\epsilon_{123} = \epsilon_{231} = \epsilon_{312}$, but that $\epsilon_{1234} = -\epsilon_{2341} = \epsilon_{3412} = -\epsilon_{4123}$.

b) Show that

$$ \epsilon_{ijk}\epsilon_{i'j'k'} = \delta_{ii'}\delta_{jj'}\delta_{kk'} + \text{five other terms}, $$

where you should write out all six terms explicitly.

c) Show that $\epsilon_{ijk}\epsilon_{ij'k'} = \delta_{jj'}\delta_{kk'} - \delta_{jk'}\delta_{kj'}$.

d) For dimension $n = 4$, write out $\epsilon_{ijkl}\epsilon_{ij'k'l'}$ as a sum of products of $\delta$'s
similar to the one in part (c).

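Exercise 1.6: Vector Products. The vector product of two three-vectors may
be written in Cartesian components as $(\mathbf{a} \times \mathbf{b})_i = \epsilon_{ijk} a_j b_k$. Use this and your
results about $\epsilon_{ijk}$ from the previous exercise to show that

i) $\mathbf{a} \cdot (\mathbf{b} \times \mathbf{c}) = \mathbf{b} \cdot (\mathbf{c} \times \mathbf{a}) = \mathbf{c} \cdot (\mathbf{a} \times \mathbf{b})$,

ii) $\mathbf{a} \times (\mathbf{b} \times \mathbf{c}) = (\mathbf{a} \cdot \mathbf{c})\mathbf{b} - (\mathbf{a} \cdot \mathbf{b})\mathbf{c}$,

iii) $(\mathbf{a} \times \mathbf{b}) \cdot (\mathbf{c} \times \mathbf{d}) = (\mathbf{a} \cdot \mathbf{c})(\mathbf{b} \cdot \mathbf{d}) - (\mathbf{a} \cdot \mathbf{d})(\mathbf{b} \cdot \mathbf{c})$.

iv) If we take $\mathbf{a}$, $\mathbf{b}$, $\mathbf{c}$ and $\mathbf{d}$, with $\mathbf{d} = \mathbf{b}$, to be unit vectors, show that
the identities (i) and (iii) become the sine and cosine rule, respectively,
of spherical trigonometry. (Hint: for the spherical sine rule, begin by
showing that $\mathbf{a} \cdot [(\mathbf{a} \times \mathbf{b}) \times (\mathbf{a} \times \mathbf{c})] = \mathbf{a} \cdot (\mathbf{b} \times \mathbf{c})$.)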



1.3.2 Stress and strain 

As an illustration of the utility of Cartesian tensors, we consider their appli- 
cation to elasticity. 

Suppose that an elastic body is slightly deformed so that the particle that
was originally at the point with Cartesian co-ordinates $x_i$ is moved to $x_i + \eta_i$.
We define the (infinitesimal) strain tensor by

$$ e_{ij} = \frac{1}{2}\left(\frac{\partial \eta_j}{\partial x_i} + \frac{\partial \eta_i}{\partial x_j}\right). \qquad (1.90) $$

It is automatically symmetric: $e_{ij} = e_{ji}$. We will leave for later (exercise
2.3) a discussion of why this is the natural definition of strain, and also
the modifications necessary were we to employ a non-Cartesian co-ordinate
system.

To define the stress tensor $\sigma_{ij}$ we consider the portion $\Omega$ of the body in
figure 1.1, and an element of area $d\mathbf{S} = \mathbf{n}\,d|S|$ on its boundary. Here, $\mathbf{n}$ is
the unit normal vector pointing out of $\Omega$. The force $\mathbf{F}$ exerted on this surface
element by the parts of the body exterior to $\Omega$ has components

$$ F_i = \sigma_{ij} n_j\, d|S|. \qquad (1.91) $$




Figure 1.1: Stress forces. 



That $\mathbf{F}$ is a linear function of $\mathbf{n}\,d|S|$ can be seen by considering the forces
on a small tetrahedron, three of whose sides coincide with the co-ordinate
planes, the fourth side having $\mathbf{n}$ as its normal. In the limit that the lengths
of the sides go to zero as $\epsilon$, the mass of the body scales to zero as $\epsilon^3$, but
the forces are proportional to the areas of the sides and go to zero only as $\epsilon^2$.
Only if the linear relation holds true can the acceleration of the tetrahedron
remain finite. A similar argument applied to torques and the moment of
inertia of a small cube shows that $\sigma_{ij} = \sigma_{ji}$.

A generalization of Hooke's law,

$$ \sigma_{ij} = c_{ijkl} e_{kl}, \qquad (1.92) $$

relates the stress to the strain via the tensor of elastic constants $c_{ijkl}$. This
rank-four tensor has the symmetry properties

$$ c_{ijkl} = c_{klij} = c_{jikl} = c_{ijlk}. \qquad (1.93) $$

In other words, the tensor is symmetric under the interchange of the first
and second pairs of indices, and also under the interchange of the individual
indices in either pair.

For an isotropic material (a material whose properties are invariant
under the rotation group $SO(3)$) the tensor of elastic constants must be an
isotropic tensor. The most general such tensor with the required symmetries
is

$$ c_{ijkl} = \lambda\,\delta_{ij}\delta_{kl} + \mu(\delta_{ik}\delta_{jl} + \delta_{il}\delta_{jk}). \qquad (1.94) $$

An isotropic material is therefore characterized by only two independent parameters,
$\lambda$ and $\mu$. These are called the Lamé constants after the mathematical
engineer Gabriel Lamé. In terms of them the generalized Hooke's law
becomes

$$ \sigma_{ij} = \lambda\,\delta_{ij} e_{kk} + 2\mu\, e_{ij}. \qquad (1.95) $$

By considering particular deformations, we can express the more directly
measurable bulk modulus, shear modulus, Young's modulus and Poisson's
ratio in terms of $\lambda$ and $\mu$.

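The bulk modulus $\kappa$ is defined by

$$ \frac{dV}{V} = -\kappa^{-1}\, dP, \qquad (1.96) $$

where an infinitesimal isotropic external pressure $dP$ causes a change $V \to V + dV$
in the volume of the material. This applied pressure corresponds to
a surface stress of $\sigma_{ij} = -\delta_{ij}\, dP$. An isotropic expansion displaces points in
the material so that

$$ \eta_i = \frac{1}{3}\frac{dV}{V}\, x_i. \qquad (1.97) $$

The strains are therefore given by

$$ e_{ij} = \frac{1}{3}\,\delta_{ij}\frac{dV}{V}. \qquad (1.98) $$

Inserting this strain into the stress-strain relation gives

$$ \sigma_{ij} = \delta_{ij}\left(\lambda + \frac{2}{3}\mu\right)\frac{dV}{V} = -\delta_{ij}\, dP. \qquad (1.99) $$

Thus

$$ \kappa = \lambda + \frac{2}{3}\mu. \qquad (1.100) $$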

To define the shear modulus, we assume a deformation $\eta_1 = \theta x_2$, so
$e_{12} = e_{21} = \theta/2$, with all other $e_{ij}$ vanishing.

Figure 1.2: Shear strain. The arrows show the direction of the applied
stresses. The $\sigma_{21}$ on the vertical faces are necessary to stop the body rotating.

The applied shear stress is $\sigma_{12} = \sigma_{21}$. The shear modulus is defined to be
$\sigma_{12}/\theta$. Inserting the strain components into the stress-strain relation gives

$$ \sigma_{12} = \mu\theta, \qquad (1.101) $$

and so the shear modulus is equal to the Lamé constant $\mu$. We can therefore
write the generalized Hooke's law as

$$ \sigma_{ij} = 2\mu\left(e_{ij} - \tfrac{1}{3}\delta_{ij} e_{kk}\right) + \kappa\, e_{kk}\delta_{ij}, \qquad (1.102) $$

which reveals that the shear modulus is associated with the traceless part of
the strain tensor, and the bulk modulus with the trace.
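
The equivalence of the two forms (1.95) and (1.102) of the isotropic Hooke's law is easy to confirm numerically. The sketch below is illustrative only: the function names and the numerical values of `lam`, `mu`, `theta` and the sample strain are assumptions made for the example, in arbitrary units.

```python
def delta(i, j):
    """Kronecker delta."""
    return 1.0 if i == j else 0.0

def trace(e):
    return sum(e[k][k] for k in range(3))

def stress_lame(e, lam, mu):
    """Equation (1.95): sigma_ij = lam * delta_ij * e_kk + 2 * mu * e_ij."""
    tr = trace(e)
    return [[lam * delta(i, j) * tr + 2.0 * mu * e[i][j]
             for j in range(3)] for i in range(3)]

def stress_bulk_shear(e, lam, mu):
    """Equation (1.102), with kappa = lam + 2 * mu / 3 from (1.100)."""
    kappa = lam + 2.0 * mu / 3.0
    tr = trace(e)
    return [[2.0 * mu * (e[i][j] - delta(i, j) * tr / 3.0)
             + kappa * tr * delta(i, j)
             for j in range(3)] for i in range(3)]

lam, mu = 1.2, 0.8   # illustrative Lame constants
theta = 1.0e-3       # small shear angle

# Pure shear: e_12 = e_21 = theta / 2, so sigma_12 should equal mu * theta.
shear = [[0.0, theta / 2, 0.0], [theta / 2, 0.0, 0.0], [0.0, 0.0, 0.0]]
sigma_shear = stress_lame(shear, lam, mu)

# A generic symmetric strain: both forms of Hooke's law should agree.
general = [[1e-3, 2e-4, 0.0], [2e-4, -5e-4, 3e-4], [0.0, 3e-4, 7e-4]]
sigma_a = stress_lame(general, lam, mu)
sigma_b = stress_bulk_shear(general, lam, mu)
max_diff = max(abs(sigma_a[i][j] - sigma_b[i][j])
               for i in range(3) for j in range(3))
print(sigma_shear[0][1], mu * theta, max_diff)
```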

Young's modulus $Y$ is measured by stretching a wire of initial length $L$
and square cross section of side $W$ under a tension $T = \sigma_{33} W^2$.

Figure 1.3: Forces on a stretched wire.



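We define $Y$ so that

$$ \sigma_{33} = Y\frac{dL}{L}. \qquad (1.103) $$

At the same time as the wire stretches, its width changes $W \to W + dW$.
Poisson's ratio $\sigma$ is defined by

$$ \frac{dW}{W} = -\sigma\frac{dL}{L}, \qquad (1.104) $$

so that $\sigma$ is positive if the wire gets thinner as it gets longer. The displacements
are

$$ \eta_3 = z\left(\frac{dL}{L}\right), $$
$$ \eta_1 = x\left(\frac{dW}{W}\right) = -\sigma x\left(\frac{dL}{L}\right), $$
$$ \eta_2 = y\left(\frac{dW}{W}\right) = -\sigma y\left(\frac{dL}{L}\right), \qquad (1.105) $$

so the strain components are

$$ e_{33} = \frac{dL}{L}, \qquad e_{11} = e_{22} = \frac{dW}{W} = -\sigma e_{33}. \qquad (1.106) $$

We therefore have

$$ \sigma_{33} = \left(\lambda(1 - 2\sigma) + 2\mu\right)\left(\frac{dL}{L}\right), \qquad (1.107) $$

leading to

$$ Y = \lambda(1 - 2\sigma) + 2\mu. \qquad (1.108) $$

Now, the side of the wire is a free surface with no forces acting on it, so

$$ 0 = \sigma_{22} = \sigma_{11} = \left(\lambda(1 - 2\sigma) - 2\sigma\mu\right)\left(\frac{dL}{L}\right). \qquad (1.109) $$

This tells us that$^6$

$$ \sigma = \frac{1}{2}\,\frac{\lambda}{\lambda + \mu}, \qquad (1.110) $$

and

$$ Y = \mu\left(\frac{3\lambda + 2\mu}{\lambda + \mu}\right). \qquad (1.111) $$

Other relations, following from those above, are

$$ Y = 3\kappa(1 - 2\sigma) $$
$$ \phantom{Y} = 2\mu(1 + \sigma). \qquad (1.112) $$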
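
The relations among the elastic moduli can be collected into a small conversion routine. This is a hedged sketch rather than a library function: `moduli_from_lame` is a hypothetical helper that simply evaluates (1.100), (1.110) and (1.111), and the input values are illustrative.

```python
def moduli_from_lame(lam, mu):
    """Return (bulk modulus, Poisson ratio, Young's modulus) from the
    Lame constants, using equations (1.100), (1.110) and (1.111)."""
    kappa = lam + 2.0 * mu / 3.0
    poisson = 0.5 * lam / (lam + mu)
    young = mu * (3.0 * lam + 2.0 * mu) / (lam + mu)
    return kappa, poisson, young

# Illustrative values in arbitrary units.
kappa, poisson, young = moduli_from_lame(2.0, 1.0)

# Both expressions in (1.112) should reproduce the same Young's modulus.
young_from_bulk = 3.0 * kappa * (1.0 - 2.0 * poisson)
young_from_shear = 2.0 * 1.0 * (1.0 + poisson)
print(young, young_from_bulk, young_from_shear)

# The Poisson-Cauchy assumption lam = mu gives a Poisson ratio of 1/4.
_, poisson_equal, _ = moduli_from_lame(1.0, 1.0)
print(poisson_equal)
```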



6 Poisson and Cauchy believed that $\lambda = \mu$, and hence that $\sigma = 1/4$.

Exercise 1.7: Show that the symmetries

$$ c_{ijkl} = c_{klij} = c_{jikl} = c_{ijlk} $$

imply that a general homogeneous material has 21 independent elastic constants.
(This result was originally obtained by George Green, of Green function
fame.)
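
The count in Exercise 1.7 can be cross-checked by enumeration, which is not a substitute for the argument the exercise asks for. The sketch below assumes that two index quadruples label the same constant whenever the stated symmetries relate them; the function `canonical` is a hypothetical helper that picks one representative per class.

```python
from itertools import product

def canonical(i, j, k, l):
    """Representative of (i, j, k, l) under i<->j, k<->l and (ij)<->(kl)."""
    first = tuple(sorted((i, j)))
    second = tuple(sorted((k, l)))
    return tuple(sorted((first, second)))

# Every class of symmetry-related index quadruples is one free constant.
independent = {canonical(*idx) for idx in product(range(3), repeat=4)}
print(len(independent))
```

Equivalently, a symmetric pair $(ij)$ takes six values, and a symmetric 6-by-6 array has $6 \cdot 7/2$ independent entries.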

Exercise 1.8: A steel beam is forged so that its cross section has the shape of
a region $\Gamma \subset \mathbb{R}^2$. When undeformed, it lies along the $z$ axis. The centroid $O$
of each cross section is defined so that

$$ \int_\Gamma x\, dx\,dy = \int_\Gamma y\, dx\,dy = 0, $$

when the co-ordinates $x$, $y$ are taken with the centroid $O$ as the origin. The
beam is slightly bent away from the $z$ axis so that the line of centroids remains
in the $y$, $z$ plane. At a particular cross section with centroid $O$, the line of
centroids has radius of curvature $R$.

Figure 1.4: Bent beam.

Assume that the deformation in the vicinity of $O$ is such that

$$ \eta_x = -\frac{\sigma}{R}\,xy, \qquad \eta_y = \frac{1}{2R}\left\{\sigma(x^2 - y^2) - z^2\right\}, \qquad \eta_z = \frac{1}{R}\,yz. $$

Figure 1.5: The original (dashed) and anticlastically deformed (full) cross-section.



For positive Poisson ratio, the cross section deforms anticlastically: the sides
bend up as the beam bends down.

Compute the strain tensor resulting from the given deformation, and show
that its only non-zero components are

$$ e_{xx} = -\frac{\sigma}{R}\,y, \qquad e_{yy} = -\frac{\sigma}{R}\,y, \qquad e_{zz} = \frac{1}{R}\,y. $$

Next, show that

$$ \sigma_{zz} = \left(\frac{Y}{R}\right) y, $$

and that all other components of the stress tensor vanish. Deduce from this
vanishing that the assumed deformation satisfies the free-surface boundary
condition, and so is indeed the way the beam responds when it is bent by
forces applied at its ends.

The work done in bending the beam

$$ \int_{\rm beam} \frac{1}{2}\, e_{ij} c_{ijkl} e_{kl}\, d^3x $$

is stored as elastic energy. Show that for our bent rod this energy is equal to

$$ \int \frac{YI}{2}\left(\frac{1}{R^2}\right) ds \approx \int \frac{YI}{2}\,(y'')^2\, dz, $$

where $s$ is the arc-length taken along the line of centroids of the beam,

$$ I = \int_\Gamma y^2\, dx\,dy $$

is the moment of inertia of the region $\Gamma$ about the $x$ axis, and $y''$ denotes
the second derivative of the deflection of the beam with respect to $z$ (which
approximates the arc-length). This last formula for the strain energy has been
used in a number of our calculus-of-variations problems.

Figure 1.6: The distribution of forces $\sigma_{zz}$ exerted on the left-hand part of the
bent rod by the material to its right.

1.3.3 Maxwell stress tensor 

Consider a small cubical element of an elastic body. If the stress tensor were
position independent, the external forces on each pair of opposing faces of
the cube would be equal in magnitude but pointing in opposite directions.
There would therefore be no net external force on the cube. When $\sigma_{ij}$ is not
constant then we claim that the total force acting on an infinitesimal element
of volume $dV$ is

$$ F_i = \partial_j \sigma_{ij}\, dV. \qquad (1.113) $$

To see that this assertion is correct, consider a finite region $\Omega$ with boundary
$\partial\Omega$, and use the divergence theorem to write the total force on $\Omega$ as

$$ F_i^{\rm tot} = \int_{\partial\Omega} \sigma_{ij} n_j\, d|S| = \int_\Omega \partial_j \sigma_{ij}\, dV. \qquad (1.114) $$

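Whenever the force-per-unit-volume $f_i$ acting on a body can be written
in the form $f_i = \partial_j \sigma_{ij}$, we refer to $\sigma_{ij}$ as a "stress tensor," by analogy with
stress in an elastic solid. As an example, let $\mathbf{E}$ and $\mathbf{B}$ be electric and magnetic
fields. For simplicity, initially assume them to be static. The force per unit
volume exerted by these fields on a distribution of charge $\rho$ and current $\mathbf{j}$ is

$$ \mathbf{f} = \rho\mathbf{E} + \mathbf{j} \times \mathbf{B}. \qquad (1.115) $$

From Gauss' law $\rho = {\rm div}\,\mathbf{D}$, and with $\mathbf{D} = \epsilon_0\mathbf{E}$, we find that the force per
unit volume due to the electric field has components

$$ \rho E_i = (\partial_j D_j) E_i = \epsilon_0\left(\partial_j(E_i E_j) - E_j\,\partial_j E_i\right) $$
$$ \phantom{\rho E_i} = \epsilon_0\,\partial_j\left(E_i E_j - \tfrac{1}{2}\delta_{ij}|E|^2\right). \qquad (1.116) $$

Here, in passing from the first line to the second, we have used the fact that
${\rm curl}\,\mathbf{E}$ is zero for static fields, and so $\partial_j E_i = \partial_i E_j$. Similarly, using $\mathbf{j} = {\rm curl}\,\mathbf{H}$,
together with $\mathbf{B} = \mu_0\mathbf{H}$ and ${\rm div}\,\mathbf{B} = 0$, we find that the force per unit volume
due to the magnetic field has components

$$ (\mathbf{j} \times \mathbf{B})_i = \mu_0\,\partial_j\left(H_i H_j - \tfrac{1}{2}\delta_{ij}|H|^2\right). \qquad (1.117) $$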



The quantity

$$ \sigma_{ij} = \epsilon_0\left(E_i E_j - \tfrac{1}{2}\delta_{ij}|E|^2\right) + \mu_0\left(H_i H_j - \tfrac{1}{2}\delta_{ij}|H|^2\right) \qquad (1.118) $$

is called the Maxwell stress tensor. Its utility lies in the fact that the
total electromagnetic force on an isolated body is the integral of the Maxwell
stress over its surface. We do not need to know the fields within the body.

Michael Faraday was the first to intuit a picture of electromagnetic stresses 
and attributed both a longitudinal tension and a mutual lateral repulsion to 
the field lines. Maxwell's tensor expresses this idea mathematically. 
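
Faraday's picture can be read off from (1.118) for the simplest case of a uniform electric field. The sketch below is illustrative: the function `maxwell_stress`, the field strength, and the choice of the $z$ axis as the field direction are assumptions made for the example, and the constants are the SI vacuum values.

```python
EPS0 = 8.8541878128e-12   # vacuum permittivity, F/m
MU0 = 1.25663706212e-6    # vacuum permeability, H/m

def maxwell_stress(E, H):
    """Maxwell stress tensor of equation (1.118) for field 3-vectors E, H."""
    E2 = sum(c * c for c in E)
    H2 = sum(c * c for c in H)
    def d(i, j):
        return 1.0 if i == j else 0.0
    return [[EPS0 * (E[i] * E[j] - 0.5 * d(i, j) * E2)
             + MU0 * (H[i] * H[j] - 0.5 * d(i, j) * H2)
             for j in range(3)] for i in range(3)]

# Uniform electric field along z, no magnetic field (illustrative values).
E = (0.0, 0.0, 1.0e5)   # V/m
H = (0.0, 0.0, 0.0)
sigma = maxwell_stress(E, H)

# Force per unit area on a surface with outward normal n is sigma_ij n_j.
tension_along_field = sigma[2][2]    # positive: pulls along the field lines
pressure_across_field = sigma[0][0]  # negative: pushes field lines apart
print(tension_along_field, pressure_across_field)
```

The positive $\sigma_{zz}$ is the longitudinal tension and the negative $\sigma_{xx} = \sigma_{yy}$ is the lateral repulsion, both of magnitude $\epsilon_0|E|^2/2$.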

Exercise 1.9: Allow the fields in the preceding calculation to be time dependent.
Show that Maxwell's equations

$$ {\rm curl}\,\mathbf{E} = -\frac{\partial \mathbf{B}}{\partial t}, \qquad {\rm div}\,\mathbf{B} = 0, $$
$$ {\rm curl}\,\mathbf{H} = \mathbf{j} + \frac{\partial \mathbf{D}}{\partial t}, \qquad {\rm div}\,\mathbf{D} = \rho, $$

with $\mathbf{B} = \mu_0\mathbf{H}$, $\mathbf{D} = \epsilon_0\mathbf{E}$, and $c = 1/\sqrt{\mu_0\epsilon_0}$, lead to

$$ (\rho\mathbf{E} + \mathbf{j} \times \mathbf{B})_i + \frac{\partial}{\partial t}\left\{\frac{1}{c^2}(\mathbf{E} \times \mathbf{H})_i\right\} = \partial_j \sigma_{ij}. $$

The left-hand side is the time rate of change of the mechanical (first term)
and electromagnetic (second term) momentum density. Observe that we can
equivalently write

$$ \frac{\partial}{\partial t}\left\{\frac{1}{c^2}(\mathbf{E} \times \mathbf{H})_i\right\} + \partial_j(-\sigma_{ij}) = -(\rho\mathbf{E} + \mathbf{j} \times \mathbf{B})_i $$

and think of this as a local field-momentum conservation law. In this interpretation
$-\sigma_{ij}$ is thought of as the momentum flux tensor, its entries being the
flux in direction $j$ of the component of field momentum in direction $i$. The
term on the right-hand side is the rate at which momentum is being supplied
to the electromagnetic field by the charges and currents.

1.4 Further Exercises and Problems 

Exercise 1.10: Quotient theorem. Suppose that you have come up with some
recipe for generating an array of numbers $T^{ijk}$ in any co-ordinate frame, and
want to know whether these numbers are the components of a triply contravariant
tensor. Suppose further that you know that, given the components
$a_{ij}$ of an arbitrary doubly covariant tensor, the numbers

$$ T^{ijk} a_{jk} = v^i $$

transform as the components of a contravariant vector. Show that $T^{ijk}$ does
indeed transform as a triply contravariant tensor. (The natural generalization
of this result to arbitrary tensor types is known as the quotient theorem.)

Exercise 1.11: Let $T^i{}_j$ be the 3-by-3 array of components of a tensor. Show
that the quantities

$$ a = T^i{}_i, \qquad b = T^i{}_j T^j{}_i, \qquad c = T^i{}_j T^j{}_k T^k{}_i $$

are invariant. Further show that the eigenvalues of the linear map represented
by the matrix $T^i{}_j$ can be found by solving the cubic equation

$$ \lambda^3 - a\lambda^2 + \frac{1}{2}(a^2 - b)\lambda - \frac{1}{6}(a^3 - 3ab + 2c) = 0. $$
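
A quick numerical spot check of the cubic in Exercise 1.11 is possible with a matrix whose eigenvalues are known in advance. The sketch below uses an upper-triangular matrix, chosen for illustration so that its eigenvalues 2, 3 and 5 sit on the diagonal; it checks one instance and does not replace the proof asked for.

```python
def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n))
             for j in range(n)] for i in range(n)]

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

# Illustrative upper-triangular matrix with eigenvalues 2, 3, 5.
T = [[2.0, 1.0, 0.0],
     [0.0, 3.0, 1.0],
     [0.0, 0.0, 5.0]]

T2 = matmul(T, T)
T3 = matmul(T2, T)
a, b, c = trace(T), trace(T2), trace(T3)   # the three invariants

def cubic(lam):
    """Left-hand side of the cubic equation in Exercise 1.11."""
    return (lam ** 3 - a * lam ** 2 + 0.5 * (a * a - b) * lam
            - (a ** 3 - 3.0 * a * b + 2.0 * c) / 6.0)

# Each residual should vanish at an eigenvalue.
residuals = [cubic(lam) for lam in (2.0, 3.0, 5.0)]
print(a, b, c, residuals)
```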

Exercise 1.12: Let the covariant tensor $R_{ijkl}$ possess the following symmetries:

i) $R_{ijkl} = -R_{jikl}$,
ii) $R_{ijkl} = -R_{ijlk}$,
iii) $R_{ijkl} + R_{iklj} + R_{iljk} = 0$.

Use the properties i), ii), iii) to show that:

a) $R_{ijkl} = R_{klij}$.

b) If $R_{ijkl} x^i y^j x^k y^l = 0$ for all vectors $x^i$, $y^i$, then $R_{ijkl} = 0$.

c) If $B_{ij}$ is a symmetric covariant tensor and we set $A_{ijkl} = B_{ik}B_{jl} - B_{il}B_{jk}$,
then $A_{ijkl}$ has the same symmetries as $R_{ijkl}$.

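Exercise 1.13: Write out Euler's equation for fluid motion

$$ \dot{\mathbf{v}} + (\mathbf{v} \cdot \nabla)\mathbf{v} = -\nabla h $$

in Cartesian tensor notation. Transform it into

$$ \dot{\mathbf{v}} - \mathbf{v} \times \boldsymbol{\omega} = -\nabla\left(\frac{1}{2}\mathbf{v}^2 + h\right), $$

where $\boldsymbol{\omega} = \nabla \times \mathbf{v}$ is the vorticity. Deduce Bernoulli's theorem, that for steady
($\dot{\mathbf{v}} = 0$) flow the quantity $\frac{1}{2}\mathbf{v}^2 + h$ is constant along streamlines.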



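Exercise 1.14: Symmetric integration. Show that the $n$-dimensional integral

$$ I_{\alpha\beta\gamma\delta} = \int \frac{d^n k}{(2\pi)^n}\,(k_\alpha k_\beta k_\gamma k_\delta)\, f(k^2), $$

is equal to

$$ A(\delta_{\alpha\beta}\delta_{\gamma\delta} + \delta_{\alpha\gamma}\delta_{\beta\delta} + \delta_{\alpha\delta}\delta_{\beta\gamma}) $$

where

$$ A = \frac{1}{n(n+2)} \int \frac{d^n k}{(2\pi)^n}\,(k^2)^2 f(k^2). $$

Similarly evaluate

$$ I_{\alpha\beta\gamma\delta\epsilon} = \int \frac{d^n k}{(2\pi)^n}\,(k_\alpha k_\beta k_\gamma k_\delta k_\epsilon)\, f(k^2). $$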

Exercise 1.15: Write down the most general three-dimensional isotropic tensors
of rank two and three.

In piezoelectric materials, the application of an electric field $E_i$ induces a
mechanical strain that is described by a rank-two symmetric tensor

$$ e_{ij} = d_{ijk} E_k, $$

where $d_{ijk}$ is a third-rank tensor that depends only on the material. Show
that $e_{ij}$ can only be non-zero in an anisotropic material.

Exercise 1.16: In three dimensions, a rank-five isotropic tensor $T_{ijklm}$ is a
linear combination of expressions of the form $\epsilon_{i_1 i_2 i_3}\delta_{i_4 i_5}$ for some assignment
of the indices $i, j, k, l, m$ to the $i_1, \ldots, i_5$. Show that, on taking into account
the symmetries of the Kronecker and Levi-Civita symbols, we can construct
ten distinct products $\epsilon_{i_1 i_2 i_3}\delta_{i_4 i_5}$. Only six of these are linearly independent,
however. Show, for example, that

$$ \epsilon_{ijk}\delta_{lm} - \epsilon_{jkl}\delta_{im} + \epsilon_{kli}\delta_{jm} - \epsilon_{lij}\delta_{km} = 0, $$

and find the three other independent relations of this sort.$^7$
(Hint: Begin by showing that, in three dimensions,

$$ \delta^{i_1 i_2 i_3 i_4}_{i_5 i_6 i_7 i_8} \stackrel{\rm def}{=}
\begin{vmatrix}
\delta_{i_1 i_5} & \delta_{i_1 i_6} & \delta_{i_1 i_7} & \delta_{i_1 i_8} \\
\delta_{i_2 i_5} & \delta_{i_2 i_6} & \delta_{i_2 i_7} & \delta_{i_2 i_8} \\
\delta_{i_3 i_5} & \delta_{i_3 i_6} & \delta_{i_3 i_7} & \delta_{i_3 i_8} \\
\delta_{i_4 i_5} & \delta_{i_4 i_6} & \delta_{i_4 i_7} & \delta_{i_4 i_8}
\end{vmatrix} = 0, $$

and contract with $\epsilon_{i_6 i_7 i_8}$.)

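Problem 1.17: The Plücker Relations. This problem provides a challenging
test of your understanding of linear algebra. It leads you through the task of
deriving the necessary and sufficient conditions for

$$ A = \frac{1}{k!} A^{i_1 \ldots i_k}\, \mathbf{e}_{i_1} \wedge \ldots \wedge \mathbf{e}_{i_k} \in \bigwedge\nolimits^k V $$

to be decomposable as

$$ A = \mathbf{f}_1 \wedge \mathbf{f}_2 \wedge \ldots \wedge \mathbf{f}_k. $$

The trick is to introduce two subspaces of $V$,

i) $W$, the smallest subspace of $V$ such that $A \in \bigwedge^k W$,

ii) $W' = \{\mathbf{v} \in V : \mathbf{v} \wedge A = 0\}$,

and explore their relationship.

a) Show that if $\{\mathbf{w}_1, \mathbf{w}_2, \ldots, \mathbf{w}_n\}$ constitute a basis for $W'$, then

$$ A = \mathbf{w}_1 \wedge \mathbf{w}_2 \wedge \cdots \wedge \mathbf{w}_n \wedge \varphi $$

for some $\varphi \in \bigwedge^{k-n} V$. Conclude that $W' \subseteq W$, and that equality
holds if and only if $A$ is decomposable, in which case $W = W' =
{\rm span}\{\mathbf{f}_1 \ldots \mathbf{f}_k\}$.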



7 Such relations are called syzygies. A recipe for constructing linearly independent basis
sets of isotropic tensors can be found in: G. F. Smith, Tensor, 19 (1968) 79-88.

b) Now show that $W$ is the image space of $\bigwedge^{k-1} V^*$ under the map that
takes

$$ \Xi = \Xi_{i_1 \ldots i_{k-1}}\, \mathbf{e}^{*i_1} \wedge \ldots \wedge \mathbf{e}^{*i_{k-1}} \in \bigwedge\nolimits^{k-1} V^* $$

to

$$ i(\Xi)A \stackrel{\rm def}{=} \Xi_{i_1 \ldots i_{k-1}} A^{i_1 \ldots i_{k-1} j}\, \mathbf{e}_j \in V. $$

Deduce that the condition $W \subseteq W'$ is that

$$ (i(\Xi)A) \wedge A = 0, \qquad \forall\, \Xi \in \bigwedge\nolimits^{k-1} V^*. $$

c) By taking

$$ \Xi = \mathbf{e}^{*i_1} \wedge \ldots \wedge \mathbf{e}^{*i_{k-1}}, $$

show that the condition in part b) can be written as

$$ A^{i_1 \ldots i_{k-1} j_1} A^{j_2 j_3 \ldots j_{k+1}}\, \mathbf{e}_{j_1} \wedge \ldots \wedge \mathbf{e}_{j_{k+1}} = 0. $$

Deduce that the necessary and sufficient conditions for decomposability
are that

$$ A^{i_1 \ldots i_{k-1}[j_1} A^{j_2 j_3 \ldots j_{k+1}]} = 0 $$

for all possible index sets $i_1, \ldots, i_{k-1}, j_1, \ldots, j_{k+1}$. Here $[\ldots]$ denotes
antisymmetrization of the enclosed indices.


Chapter 2 

Differential Calculus on 
Manifolds 

In this section we will apply what we have learned about vectors and tensors 
in a linear space to the case of vector and tensor fields in a general curvilinear 
co-ordinate system. Our aim is to introduce the reader to the modern lan- 
guage of advanced calculus, and in particular to the calculus of differential 
forms on surfaces and manifolds. 



2.1 Vector and Covector Fields 

Vector fields — electric, magnetic, velocity fields, and so on — appear every- 
where in physics. After perhaps struggling with it in introductory courses, we 
rather take the field concept for granted. There remain subtleties, however. 
Consider an electric field. It makes sense to add two field vectors at a single 
point, but there is no physical meaning to the sum of field vectors $E(x_1)$ and
$E(x_2)$ at two distinct points. We should therefore regard all possible electric
fields at a single point as living in a vector space, but each different point 
in space comes with its own field-vector space. This view seems even more 
reasonable when we consider velocity vectors describing motion on a curved 
surface. 

A velocity vector lives in the tangent space to the surface at each point,
and each of these spaces is a differently oriented subspace of the
higher-dimensional ambient space.

Figure 2.1: Each point on a surface has its own vector space of tangents.

Mathematicians call such a collection of vector spaces (one for each of
the points in a surface) a vector bundle over the surface. Thus the tangent
bundle over a surface is the totality of all vector spaces tangent to the surface.
Why a bundle? This word is used because the individual tangent spaces are
not completely independent, but are tied together in a rather non-obvious
way. Try to construct a smooth field of unit vectors tangent to the surface
of a sphere. However hard you work you will end up in trouble somewhere.
You cannot comb a hairy ball. On the surface of a torus you will have no
problems. You can comb a hairy doughnut. The tangent spaces collectively
know something about the surface they are tangent to.

Although we spoke in the previous paragraph of vectors tangent to a
curved surface, it is useful to generalize this idea to vectors lying in the
tangent space of an $n$-dimensional manifold. An $n$-manifold $M$ is essentially
a space that locally looks like a part of $\mathbb{R}^n$. This means that some open
neighbourhood of each point can be parametrized by an $n$-dimensional co-ordinate
system. Such a parametrization is called a chart. Unless $M$ is $\mathbb{R}^n$
itself (or part of it), a chart will cover only part of $M$, and more than one
will be required for complete coverage. Where a pair of charts overlap we
demand that the transformation formula giving one set of co-ordinates as a
function of the other be a smooth ($C^\infty$) function, and possess a smooth
inverse.$^1$ A collection of such smoothly related co-ordinate charts covering
all of $M$ is called an atlas. The advantage of thinking in terms of manifolds
is that we do not have to understand their properties as arising from some
embedding in a higher dimensional space. Whatever structure they have,
they possess in, and of, themselves.

1 A formal definition of a manifold contains some further technical restrictions (that the
space be Hausdorff and paracompact) that are designed to eliminate pathologies. We are
more interested in doing calculus than in proving theorems, and so we will ignore these
niceties.

Classical mechanics provides a familiar illustration of these ideas. The
configuration space $M$ of a mechanical system is usually a manifold. When
the system has $n$ degrees of freedom we use generalized co-ordinates $q^i$, $i = 1, \ldots, n$,
to parameterize $M$. The tangent bundle of $M$ then provides the
setting for Lagrangian mechanics. This bundle, denoted by $TM$, is the $2n$-dimensional
space whose points consist of a point $p$ in $M$ together with a
tangent vector lying in the tangent space $TM_p$ at that point. If we think
of the tangent vector as a velocity, the natural co-ordinates on $TM$ become
$(q^1, q^2, \ldots, q^n; \dot{q}^1, \dot{q}^2, \ldots, \dot{q}^n)$, and these are the variables that appear in the
Lagrangian of the system.

If we consider a vector tangent to some curved surface, it will stick out
of it. If we have a vector tangent to a manifold, it is a straight arrow lying
atop bent co-ordinates. Should we restrict the length of the vector so that
it does not stick out too far? Are we restricted to only infinitesimal vectors?
It's best to avoid all this by inventing a clever notion of what a vector in
a tangent space is. The idea is to focus on a well-defined object such as
a derivative. Suppose our space has co-ordinates $x^\mu$. (These are not the
contravariant components of some vector.) A directional derivative is an
object such as $X^\mu\partial_\mu$, where $\partial_\mu$ is shorthand for $\partial/\partial x^\mu$. When the numbers
$X^\mu$ are functions of the co-ordinates $x^\sigma$, this object is called a tangent-vector
field, and we write$^2$

$$ X = X^\mu\partial_\mu. \qquad (2.1) $$

We regard the $\partial_\mu$ at a point $x$ as a basis for $TM_x$, the tangent-vector space at
$x$, and the $X^\mu(x)$ as the (contravariant) components of the vector $X$ at that
point. Although they are not little arrows, what the $\partial_\mu$ are is mathematically
clear, and so we know perfectly well how to deal with them.

When we change co-ordinate system from $x^\mu$ to $z^\nu$ by regarding the $x^\mu$'s
as invertible functions of the $z^\nu$'s, i.e.

$$ x^1 = x^1(z^1, z^2, \ldots, z^n), $$
$$ x^2 = x^2(z^1, z^2, \ldots, z^n), $$
$$ \vdots $$
$$ x^n = x^n(z^1, z^2, \ldots, z^n), \qquad (2.2) $$

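2 We are going to stop using bold symbols to distinguish between intrinsic objects and
their components, because from now on almost everything will be something other than a
number, and too much black ink would just be confusing.

then the chain rule for partial differentiation gives

$$ \partial_\mu \equiv \frac{\partial}{\partial x^\mu} = \frac{\partial z^\nu}{\partial x^\mu}\frac{\partial}{\partial z^\nu} = \left(\frac{\partial z^\nu}{\partial x^\mu}\right)\partial'_\nu, \qquad (2.3) $$

where $\partial'_\nu$ is shorthand for $\partial/\partial z^\nu$. By demanding that

$$ X = X^\mu\partial_\mu = X'^\nu\partial'_\nu \qquad (2.4) $$

we find the components in the $z^\nu$ co-ordinate frame to be

$$ X'^\nu = \left(\frac{\partial z^\nu}{\partial x^\mu}\right) X^\mu. \qquad (2.5) $$

Conversely, using

$$ \frac{\partial x^\sigma}{\partial z^\nu}\frac{\partial z^\nu}{\partial x^\mu} = \frac{\partial x^\sigma}{\partial x^\mu} = \delta^\sigma_\mu, \qquad (2.6) $$

we have

$$ X^\nu = \left(\frac{\partial x^\nu}{\partial z^\mu}\right) X'^\mu. \qquad (2.7) $$

This, then, is the transformation law for a contravariant vector.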
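
As a concrete illustration of the transformation law, consider the plane with Cartesian co-ordinates $(x, y)$ playing the role of the $x^\mu$ and polar co-ordinates $(r, \theta)$ playing the role of the $z^\nu$. The sketch below is an assumed example, not taken from the text: it transforms the rotation field $X = -y\,\partial_x + x\,\partial_y$, whose polar components should be $(0, 1)$, and then transforms back. The function names are hypothetical.

```python
import math

def to_polar_components(x, y, X):
    """New components X'^nu = (dz^nu/dx^mu) X^mu, as in (2.5), with z = (r, theta)."""
    r = math.hypot(x, y)
    dz_dx = [[x / r, y / r],               # dr/dx,     dr/dy
             [-y / r ** 2, x / r ** 2]]    # dtheta/dx, dtheta/dy
    return [sum(dz_dx[nu][mu] * X[mu] for mu in range(2)) for nu in range(2)]

def to_cartesian_components(r, theta, Xp):
    """Old components X^nu = (dx^nu/dz^mu) X'^mu, as in (2.7)."""
    dx_dz = [[math.cos(theta), -r * math.sin(theta)],   # dx/dr, dx/dtheta
             [math.sin(theta), r * math.cos(theta)]]    # dy/dr, dy/dtheta
    return [sum(dx_dz[nu][mu] * Xp[mu] for mu in range(2)) for nu in range(2)]

x, y = 1.5, 2.0
X = [-y, x]                        # rigid rotation about the origin
Xp = to_polar_components(x, y, X)  # expected to be close to [0, 1]
back = to_cartesian_components(math.hypot(x, y), math.atan2(y, x), Xp)
print(Xp, back)
```

In polar co-ordinates the same field is simply $\partial_\theta$, which is why its components are constant.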

It is worth pointing out that the basis vectors $\partial_\mu$ are not unit vectors. As
we have no metric, and therefore no notion of length anyway, we cannot try
to normalize them. If you insist on drawing (small?) arrows, think of $\partial_1$ as
starting at a point $(x^1, x^2, \ldots, x^n)$ and with its head at $(x^1 + 1, x^2, \ldots, x^n)$.
Of course this is only a good picture if the co-ordinates are not too "curvy."

Figure 2.2: Approximate picture of the vectors $\partial_1$ and $\partial_2$ at the point
$(x^1, x^2) = (2, 4)$.



Example: The surface of the unit sphere is a manifold. It is usually denoted
by $S^2$. We may label its points with spherical polar co-ordinates $\theta$ and $\phi$,
and these will be useful everywhere except at the north and south poles,
where they become singular because at $\theta = 0$ or $\pi$ all values of $\phi$ correspond
to the same point. In this co-ordinate basis, the tangent vector representing
the velocity field due to a rigid rotation of one radian per second about the
$z$ axis is

$$ V_z = \partial_\phi. \qquad (2.8) $$

Similarly

$$ V_x = -\sin\phi\,\partial_\theta - \cot\theta\cos\phi\,\partial_\phi, $$
$$ V_y = \cos\phi\,\partial_\theta - \cot\theta\sin\phi\,\partial_\phi, \qquad (2.9) $$

represent rigid rotations about the $x$ and $y$ axes.

We now know how to think about vectors. What about their dual-space
partners, the covectors? These live in the cotangent bundle $T^*M$, and for
them a cute notational game, due to Élie Cartan, is played. We write the
basis vectors dual to the $\partial_\mu$ as $dx^\mu(\ )$. Thus

$$ dx^\mu(\partial_\nu) = \delta^\mu_\nu. \qquad (2.10) $$

When evaluated on a vector field $X = X^\mu\partial_\mu$, the basis covectors $dx^\mu$ return
its components

$$ dx^\mu(X) = dx^\mu(X^\nu\partial_\nu) = X^\nu dx^\mu(\partial_\nu) = X^\nu\delta^\mu_\nu = X^\mu. \qquad (2.11) $$

Now, any smooth function $f \in C^\infty(M)$ will give rise to a field of covectors
in $T^*M$. This is because a vector field $X$ acts on the scalar function $f$ as

$$ Xf = X^\mu\partial_\mu f \qquad (2.12) $$

and $Xf$ is another scalar function. This new function gives a number (and
thus an element of the field $\mathbb{R}$) at each point $x \in M$. But this is exactly
what a covector does: it takes in a vector at a point and returns a number.
We will call this covector field "$df$." It is essentially the gradient of $f$. Thus

$$ df(X) \stackrel{\rm def}{=} Xf = X^\mu\frac{\partial f}{\partial x^\mu}. \qquad (2.13) $$



If we take $f$ to be the co-ordinate $x^\nu$, we have

$$ dx^\nu(X) = X^\mu\frac{\partial x^\nu}{\partial x^\mu} = X^\mu\delta^\nu_\mu = X^\nu, \qquad (2.14) $$

so this viewpoint is consistent with our previous definition of $dx^\nu$. Thus

$$ df(X) = \frac{\partial f}{\partial x^\mu}X^\mu = \frac{\partial f}{\partial x^\mu}dx^\mu(X) \qquad (2.15) $$

for any vector field $X$. In other words, we can expand $df$ as

$$ df = \frac{\partial f}{\partial x^\mu}dx^\mu. \qquad (2.16) $$

This is not some approximation to a change in $f$, but is an exact expansion
of the covector field $df$ in terms of the basis covectors $dx^\mu$.

We may retain something of the notion that $dx^\mu$ represents the (contravariant)
components of a small displacement in $x$ provided that we think
of $dx^\mu$ as a machine into which we insert the small displacement (a vector)
and have it spit out the numerical components $\delta x^\mu$. This is the same distinction
that we make between $\sin(\ )$ as a function into which one can plug
$x$, and $\sin x$, the number that results from inserting in this particular value
of $x$. Although seemingly innocent, we know that it is a distinction of great
power.

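The change of co-ordinates transformation law for a covector field $f_\mu$ is
found from

$$ f_\mu\, dx^\mu = f'_\nu\, dz^\nu, \qquad (2.17) $$

by using

$$ dx^\mu = \left(\frac{\partial x^\mu}{\partial z^\nu}\right) dz^\nu. \qquad (2.18) $$

We find

$$ f'_\nu = f_\mu\frac{\partial x^\mu}{\partial z^\nu}. \qquad (2.19) $$

A general tensor such as $Q^{\lambda\mu}{}_{\rho\sigma\tau}$ transforms as

$$ Q'^{\lambda\mu}{}_{\rho\sigma\tau}(z) = \frac{\partial z^\lambda}{\partial x^\alpha}\frac{\partial z^\mu}{\partial x^\beta}\frac{\partial x^\gamma}{\partial z^\rho}\frac{\partial x^\delta}{\partial z^\sigma}\frac{\partial x^\epsilon}{\partial z^\tau}\, Q^{\alpha\beta}{}_{\gamma\delta\epsilon}(x). \qquad (2.20) $$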



Observe how the indices are wired up: Those for the new tensor coefficients
in the new co-ordinates, $z$, are attached to the new $z$'s, and those for the old
coefficients are attached to the old $x$'s. Upstairs indices go in the numerator
of each partial derivative, and downstairs ones are in the denominator.

The language of bundles and sections

At the beginning of this section, we introduced the notion of a vector bundle.
This is a particular example of the more general concept of a fibre bundle,
where the vector space at each point in the manifold is replaced by a "fibre"
over that point. The fibre can be any mathematical object, such as a set,
tensor space, or another manifold. Mathematicians visualize the bundle as
a collection of fibres growing out of the manifold, much as stalks of wheat
grow out of the soil. When one slices through a patch of wheat with a scythe,
the blade exposes a cross-section of the stalks. By analogy, a choice of an
element of the fibre over each point in the manifold is called a cross-section,
or, more commonly, a section of the bundle. In this language a
tangent-vector field becomes a section of the tangent bundle, and a field of
covectors becomes a section of the cotangent bundle.

We provide a more detailed account of bundles in chapter 7.

2.2 Differentiating Tensors 

If $f$ is a function then $\partial_\mu f$ are components of the covariant vector $df$. Suppose
that $a^\mu$ is a contravariant vector. Are the $\partial_\nu a^\mu$ the components of a type $(1,1)$
tensor? The answer is no! In general, differentiating the components of a
tensor does not give rise to another tensor. One can see why at two levels:

a) Consider the transformation laws. They contain expressions of the form
$\partial x^\mu/\partial z^\nu$. If we differentiate both sides of the transformation law of a
tensor, these factors are also differentiated, but tensor transformation
laws never contain second derivatives, such as $\partial^2 x^\mu/\partial z^\nu \partial z^\sigma$.

b) Differentiation requires subtracting vectors or tensors at different points,
but vectors at different points are in different vector spaces, so their
difference is not defined.
These two reasons are really one and the same. We need to be cleverer to 
get new tensors by differentiating old ones. 
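
The obstruction described in (a) can be made explicit by a short calculation. The following is a sketch under the assumption that $a^\mu$ obeys the contravariant law (2.5), so that $a'^\nu(z) = (\partial z^\nu/\partial x^\mu)\,a^\mu(x)$; differentiating with respect to $z^\sigma$ and using the chain rule gives two terms, only the first of which has the form required of a type $(1,1)$ tensor.

```latex
% Assumed starting point: a'^\nu(z) = (\partial z^\nu / \partial x^\mu) a^\mu(x),
% i.e. the contravariant law (2.5), differentiated with respect to z^\sigma.
\begin{align*}
\frac{\partial a'^{\nu}}{\partial z^{\sigma}}
 &= \frac{\partial x^{\rho}}{\partial z^{\sigma}}
    \frac{\partial}{\partial x^{\rho}}
    \left( \frac{\partial z^{\nu}}{\partial x^{\mu}}\, a^{\mu} \right) \\
 &= \underbrace{\frac{\partial x^{\rho}}{\partial z^{\sigma}}
    \frac{\partial z^{\nu}}{\partial x^{\mu}}
    \frac{\partial a^{\mu}}{\partial x^{\rho}}}_{\text{type (1,1) tensor law}}
  + \underbrace{\frac{\partial x^{\rho}}{\partial z^{\sigma}}
    \frac{\partial^{2} z^{\nu}}{\partial x^{\rho}\,\partial x^{\mu}}\, a^{\mu}}_{\text{extra, non-tensorial term}} .
\end{align*}
```

The second term vanishes only when the change of co-ordinates is linear, which is why the difficulty did not arise for the constant changes of basis considered in a single vector space.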

2.2.1 Lie Bracket 

One way to proceed is to note that the vector field $X$ is an operator. It makes
sense, therefore, to try to compose two of them to make another. Look at
$XY$, for example:

$$ XY = X^\mu\partial_\mu(Y^\nu\partial_\nu) = X^\mu Y^\nu\partial^2_{\mu\nu} + X^\mu\left(\frac{\partial Y^\nu}{\partial x^\mu}\right)\partial_\nu. \qquad (2.21) $$

What are we to make of this? Not much! There is no particular interpretation
for the second derivative, and as we saw above, it does not transform nicely.
But suppose we take a commutator:

$$ [X, Y] = XY - YX = \left(X^\mu(\partial_\mu Y^\nu) - Y^\mu(\partial_\mu X^\nu)\right)\partial_\nu. \qquad (2.22) $$

The second derivatives have cancelled, and what remains is a directional
derivative and so a bona-fide vector field. The components

$$ [X, Y]^\nu = X^\mu(\partial_\mu Y^\nu) - Y^\mu(\partial_\mu X^\nu) \qquad (2.23) $$

are the components of a new contravariant vector field made from the two
old vector fields. It is called the Lie bracket of the two fields, and has a
geometric interpretation.

To understand the geometry of the Lie bracket, we first define the flow
associated with a tangent-vector field $X$. This is the map that takes a point
$x_0$ and maps it to $x(t)$ by solving the family of equations

$$\frac{dx^\mu}{dt} = X^\mu(x^1, x^2, \ldots, x^d), \tag{2.24}$$

with initial condition $x^\mu(0) = x_0^\mu$. In words, we regard $X$ as the velocity field
of a flowing fluid, and let $x$ ride along with the fluid.

Now envisage $X$ and $Y$ as two velocity fields. Suppose we flow along $X$
for a brief time $t$, then along $Y$ for another brief interval $s$. Next we switch
back to $X$, but with a minus sign, for time $t$, and then to $-Y$ for a final
interval of $s$. We have tried to retrace our path, but a short exercise with
Taylor's theorem shows that we will fail to return to our exact starting point.
We will miss by $\delta x^\mu = st[X, Y]^\mu$, plus corrections of cubic order in $s$ and $t$.

Figure 2.3: The Lie bracket.

Example: Let

$$V_x = -\sin\phi\,\partial_\theta - \cot\theta\cos\phi\,\partial_\phi,$$
$$V_y = \cos\phi\,\partial_\theta - \cot\theta\sin\phi\,\partial_\phi$$

be two vector fields in $T(S^2)$. We find that

$$[V_x, V_y] = -V_z,$$

where $V_z = \partial_\phi$.
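As a rough numerical illustration of both the flow picture and the sphere example, the following Python sketch estimates $[V_x, V_y]$ by central differences and compares it with the amount by which the four-leg circuit (along $X$, then $Y$, then $-X$, then $-Y$) fails to close. It is not part of the original notes: the helper names `lie_bracket` and `flow`, the Runge-Kutta integrator, the step sizes and the base point $(\theta, \phi) = (1.0, 0.5)$ are all choices made here for illustration.

```python
import math

def v_x(p):
    """Components (theta, phi) of V_x = -sin(phi) d_theta - cot(theta) cos(phi) d_phi."""
    theta, phi = p
    return [-math.sin(phi), -math.cos(phi) / math.tan(theta)]

def v_y(p):
    """Components (theta, phi) of V_y = cos(phi) d_theta - cot(theta) sin(phi) d_phi."""
    theta, phi = p
    return [math.cos(phi), -math.sin(phi) / math.tan(theta)]

def lie_bracket(X, Y, p, h=1e-6):
    """[X, Y]^nu = X^mu d_mu Y^nu - Y^mu d_mu X^nu at p, by central differences."""
    n = len(p)
    Xp, Yp = X(p), Y(p)
    result = [0.0] * n
    for mu in range(n):
        fwd = list(p)
        bwd = list(p)
        fwd[mu] += h
        bwd[mu] -= h
        dX = [(a - b) / (2 * h) for a, b in zip(X(fwd), X(bwd))]
        dY = [(a - b) / (2 * h) for a, b in zip(Y(fwd), Y(bwd))]
        for nu in range(n):
            result[nu] += Xp[mu] * dY[nu] - Yp[mu] * dX[nu]
    return result

def flow(X, p, t, steps=50):
    """Integrate dx/dt = X(x) from p for a time t (negative t flows along -X), using RK4."""
    h = t / steps
    x = list(p)
    for _ in range(steps):
        k1 = X(x)
        k2 = X([xi + 0.5 * h * k for xi, k in zip(x, k1)])
        k3 = X([xi + 0.5 * h * k for xi, k in zip(x, k2)])
        k4 = X([xi + h * k for xi, k in zip(x, k3)])
        x = [xi + h * (a + 2 * b + 2 * c + d) / 6
             for xi, a, b, c, d in zip(x, k1, k2, k3, k4)]
    return x

p0 = [1.0, 0.5]          # base point (theta, phi), chosen away from the poles
s = t = 1e-3             # brief flow times
bracket = lie_bracket(v_x, v_y, p0)

p = flow(v_x, p0, t)     # along X for time t
p = flow(v_y, p, s)      # along Y for time s
p = flow(v_x, p, -t)     # along -X for time t
p = flow(v_y, p, -s)     # along -Y for time s
miss = [(a - b) / (s * t) for a, b in zip(p, p0)]

print("[V_x, V_y] components:", bracket)
print("miss / (s t):         ", miss)
```

Both printed vectors should be close to $(0, -1)$, the components of $-V_z = -\partial_\phi$. The second agrees with the first only up to the cubic corrections mentioned in the text, which here show up as errors of order $s$ and $t$ after dividing by $st$.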



Frobenius' Theorem 

Suppose that in some region of a $d$-dimensional manifold $M$ we are given
$n < d$ linearly independent tangent-vector fields $X_i$. Such a set is called a
distribution by differential geometers. (The concept has nothing to do with
probability, or with objects like "$\delta(x)$" which are also called "distributions.")
At each point $x$, the span $\langle X_i(x)\rangle$ of the field vectors forms a subspace
of the tangent space $TM_x$, and we can picture this subspace as a fragment
of an $n$-dimensional surface passing through $x$. It is possible that these
surface fragments fit together to make a stack of smooth surfaces, called a
foliation, that fill out the $d$-dimensional space, and have the given $X_i$ as
their tangent vectors.

Figure 2.4: A local foliation.



If this is the case then starting from $x$ and taking steps only along the $X_i$
we find ourselves restricted to the $n$-surface, or $n$-submanifold, $N$ passing
through the original point $x$.

Alternatively, the surface fragments may form such an incoherent jumble
that starting from $x$ and moving only along the $X_i$ we can find our way to any
point in the neighbourhood of $x$. It is also possible that some intermediate
case applies, so that moving along the $X_i$ restricts us to an $m$-surface, where
$d > m > n$. The Lie bracket provides us with the appropriate tool with
which to investigate these possibilities.

First a definition: If there are functions $c_{ij}{}^k(x)$ such that

$$[X_i, X_j] = c_{ij}{}^k(x)\,X_k, \tag{2.25}$$

i.e. the Lie brackets close within the set $\{X_i\}$ at each point $x$, then the
distribution is said to be involutive. When our given distribution is involutive,
then the first case holds, and, at least locally, there is a foliation by
$n$-submanifolds $N$. A formal statement of this is:

Theorem (Frobenius): A smooth ($C^\infty$) involutive distribution is completely
integrable: locally, there are co-ordinates $x^\mu$, $\mu = 1, \ldots, d$, such that
$X_i = \sum_{\mu=1}^n X_i^\mu\partial_\mu$, and the surfaces $N$ through each point are in the form
$x^\mu = $ const. for $\mu = n+1, \ldots, d$. Conversely, if such co-ordinates exist then the
distribution is involutive.

Sketch of Proof: If such co-ordinates exist then it is obvious that the Lie
bracket of any pair of vectors in the form $X_i = \sum_{\mu=1}^n X_i^\mu\partial_\mu$ can also be
expanded in terms of the first $n$ basis vectors. A logically equivalent statement
exploits the geometric interpretation of the Lie bracket: If the Lie brackets
of the fields $X_i$ do not close within the $n$-dimensional span of the $X_i$, then a
sequence of back-and-forth manoeuvres along the $X_i$ allows us to escape into a
new direction, and so the $X_i$ cannot be tangent to an $n$-surface. Establishing
the converse, that closure implies the existence of the foliation, is rather
more technical, and we will not attempt it.

The physicist's version of Frobenius' theorem is usually expressed in the 
language of holonomic or anholonomic constraints. 

For example, consider a particle moving in three dimensions. If we are
told that the velocity vector is constrained to be perpendicular to the radius
vector, i.e. $\mathbf{v}\cdot\mathbf{r} = 0$, we realize that the particle is being forced to move on
the sphere $|\mathbf{r}| = r_0$ passing through the initial point. In spherical co-ordinates
the associated distribution is the set $\{\partial_\theta, \partial_\phi\}$, which is clearly involutive.
The foliation is the family of nested spheres whose centre is the origin. The
foliation is not global because it becomes singular at $r = 0$. Constraints like
this, which restrict the motion to a surface, are called holonomic.

Suppose, on the other hand, we have a ball rolling on a table. Here, we
have a five-dimensional configuration manifold $M = \mathbb{R}^2\times S^3$ parameterized
by the centre of mass $(x, y) \in \mathbb{R}^2$ of the ball and the three Euler angles
$(\theta, \phi, \psi) \in S^3$ defining its orientation. Three no-slip rolling conditions

$$\begin{aligned} \dot x &= \dot\psi\sin\theta\sin\phi + \dot\theta\cos\phi, \\ \dot y &= -\dot\psi\sin\theta\cos\phi + \dot\theta\sin\phi, \\ 0 &= \dot\psi\cos\theta + \dot\phi, \end{aligned} \tag{2.26}$$

(see exercise 2.17) link the rate of change of the Euler angles to the velocity
of the centre of mass. At each point in this five-dimensional manifold we are
free to roll the ball in two directions, and so might expect that the reachable
configurations constitute a two-dimensional surface embedded in the full
five-dimensional space. The two vector fields

$$\begin{aligned} \mathrm{roll}_x &= \partial_x - \sin\phi\cot\theta\,\partial_\phi + \cos\phi\,\partial_\theta + \operatorname{cosec}\theta\sin\phi\,\partial_\psi, \\ \mathrm{roll}_y &= \partial_y + \cos\phi\cot\theta\,\partial_\phi + \sin\phi\,\partial_\theta - \operatorname{cosec}\theta\cos\phi\,\partial_\psi, \end{aligned} \tag{2.27}$$

describing the $x$- and $y$-direction rolling motion are not in involution, however.
By calculating enough Lie brackets we eventually obtain five linearly
independent velocity vector fields, and starting from one configuration we can
reach any other. The no-slip rolling condition is said to be non-integrable, or
anholonomic. Such systems are tricky to deal with in Lagrangian dynamics.

For a $d$-dimensional mechanical system, a set of $m$ independent constraints
of the form $\omega^i_\mu(q)\dot q^\mu = 0$, $i = 1, \ldots, m$, determines an $n = d - m$
dimensional distribution. In terms of the vector $\dot q = \dot q^\mu\partial_\mu$ and the covectors

$$\omega^i = \sum_{\mu=1}^d\omega^i_\mu(q)\,dq^\mu, \quad i = 1, \ldots, m, \tag{2.28}$$

we can write these constraints as $\omega^i(\dot q) = 0$. This is known as a Pfaffian
system of equations. The Pfaffian system is said to be integrable if the
distribution it implicitly defines is in involution, and hence itself integrable.
In this case there is a set of $m$ functions $g^i(q)$ and an invertible $m$-by-$m$
matrix $f^i{}_j(q)$ such that

$$\omega^i = \sum_{j=1}^m f^i{}_j(q)\,dg^j(q). \tag{2.29}$$

The functions $g^i(q)$ can, for example, be taken to be the co-ordinate functions
$x^\mu$, $\mu = n+1, \ldots, d$, that label the foliating surfaces $N$ in the statement
of Frobenius' theorem. The system of integrable constraints $\omega^i(\dot q) = 0$
thus restricts us to the surfaces $g^i(q) = $ constant. Integrable constraints are
therefore holonomic.

The following exercise provides a familiar example of the utility of non- 
holonomic constraints: 

Exercise 2.1: Parallel Parking using Lie Brackets. 




Figure 2.5: Co-ordinates for car parking.

The configuration space of a car is four dimensional, and parameterized by
co-ordinates $(x, y, \theta, \phi)$, as shown in figure 2.5.

Define the following vector fields:

a) (front wheel) drive $= \cos\phi\,(\cos\theta\,\partial_x + \sin\theta\,\partial_y) + \sin\phi\,\partial_\theta$.

b) steer $= \partial_\phi$.

c) (front wheel) skid $= -\sin\phi\,(\cos\theta\,\partial_x + \sin\theta\,\partial_y) + \cos\phi\,\partial_\theta$.

d) park $= -\sin\theta\,\partial_x + \cos\theta\,\partial_y$.

Explain why these are apt names for the vector fields, and compute the Lie 
brackets: 

[steer, drive] , [steer, skid] , [skid, drive] , 
[park, drive] , [park, park] , [park, skid] . 

The driver can use only the operations ($\pm$) drive and ($\pm$) steer to manoeuvre
the car. Use the geometric interpretation of the Lie bracket to explain how a
suitable sequence of motions (forward, reverse, and turning the steering wheel)
can be used to manoeuvre a car sideways into a parking space.

2.2.2 Lie Derivative 

Another derivative we can define is the Lie derivative along a vector field $X$.
It is defined by its action on a scalar function $f$ as

$$\mathcal{L}_X f = Xf, \tag{2.30}$$

on a vector field by

$$\mathcal{L}_X Y = [X, Y], \tag{2.31}$$

and on anything else by requiring it to be a derivation, meaning that it obeys
Leibniz' rule. For example, let us compute the Lie derivative of a covector
$F$. We first introduce an arbitrary vector field $Y$ and plug it into $F$ to get
the scalar function $F(Y)$. Leibniz' rule is then the statement that

$$\mathcal{L}_X F(Y) = (\mathcal{L}_X F)(Y) + F(\mathcal{L}_X Y). \tag{2.32}$$

Since $F(Y)$ is a function and $Y$ a vector, both of whose derivatives we know
how to compute, we know two of the three terms in this equation. From
$\mathcal{L}_X F(Y) = XF(Y)$ and $F(\mathcal{L}_X Y) = F([X, Y])$, we have

$$XF(Y) = (\mathcal{L}_X F)(Y) + F([X, Y]), \tag{2.33}$$

and so

$$(\mathcal{L}_X F)(Y) = XF(Y) - F([X, Y]). \tag{2.34}$$

In components, this becomes

$$\begin{aligned} (\mathcal{L}_X F)(Y) &= X^\nu\partial_\nu(F_\mu Y^\mu) - F_\nu(X^\mu\partial_\mu Y^\nu - Y^\mu\partial_\mu X^\nu) \\ &= (X^\nu\partial_\nu F_\mu + F_\nu\partial_\mu X^\nu)Y^\mu. \end{aligned} \tag{2.35}$$

Note how all the derivatives of $Y^\mu$ have cancelled, so $\mathcal{L}_X F(\ )$ depends only
on the local value of $Y$. The Lie derivative of $F$ is therefore still a covector
field. This is true in general: the Lie derivative does not change the tensor
character of the objects on which it acts. Dropping the passive spectator
field $Y^\mu$, we have a formula for $\mathcal{L}_X F$ in components:

$$(\mathcal{L}_X F)_\mu = X^\nu\partial_\nu F_\mu + F_\nu\partial_\mu X^\nu. \tag{2.36}$$

Another example is provided by the Lie derivative of a type (0, 2) tensor,
such as the metric tensor. This is

$$(\mathcal{L}_X g)_{\mu\nu} = X^\alpha\partial_\alpha g_{\mu\nu} + g_{\mu\alpha}\partial_\nu X^\alpha + g_{\alpha\nu}\partial_\mu X^\alpha. \tag{2.37}$$

The Lie derivative of a metric measures the extent to which the displacement
$x^\alpha \to x^\alpha + \epsilon X^\alpha(x)$ deforms the geometry. If we write the metric as

$$g(\ ,\ ) = g_{\mu\nu}(x)\,dx^\mu\otimes dx^\nu, \tag{2.38}$$

we can understand both this geometric interpretation and the origin of the
three terms appearing in the Lie derivative. We simply make the displacement
$x^\alpha \to x^\alpha + \epsilon X^\alpha$ in the coefficients $g_{\mu\nu}(x)$ and in the two $dx^\alpha$. In the
latter we write

$$d(x^\alpha + \epsilon X^\alpha) = dx^\alpha + \epsilon\frac{\partial X^\alpha}{\partial x^\beta}dx^\beta. \tag{2.39}$$

Then we see that

$$\begin{aligned} g_{\mu\nu}(x)\,dx^\mu\otimes dx^\nu &\to \left[g_{\mu\nu}(x) + \epsilon(X^\alpha\partial_\alpha g_{\mu\nu} + g_{\mu\alpha}\partial_\nu X^\alpha + g_{\alpha\nu}\partial_\mu X^\alpha)\right]dx^\mu\otimes dx^\nu \\ &= \left[g_{\mu\nu} + \epsilon(\mathcal{L}_X g)_{\mu\nu}\right]dx^\mu\otimes dx^\nu. \end{aligned} \tag{2.40}$$

A displacement field $X$ that does not change distances between points, i.e.
one that gives rise to an isometry, must therefore satisfy $\mathcal{L}_X g = 0$. Such an
$X$ is said to be a Killing field after Wilhelm Killing who introduced them
in his study of non-euclidean geometries.

The geometric interpretation of the Lie derivative of a vector field is as
follows: In order to compute the $X$ directional derivative of a vector field $Y$,
we need to be able to subtract the vector $Y(x)$ from the vector $Y(x + \epsilon X)$,
divide by $\epsilon$, and take the limit $\epsilon \to 0$. To do this we have somehow to get the
vector $Y(x)$ from the point $x$, where it normally resides, to the new point
$x + \epsilon X$, so both vectors are elements of the same vector space. The Lie
derivative achieves this by carrying the old vector to the new point along the
field $X$.

Figure 2.6: Computing the Lie derivative of a vector.

Imagine the vector $Y$ as drawn in ink in a flowing fluid whose velocity field
is $X$. Initially the tail of $Y$ is at $x$ and its head is at $x + Y$. After flowing
for a time $\epsilon$, its tail is at $x + \epsilon X$, i.e. exactly where the tail of $Y(x + \epsilon X)$
lies. Where the head of the transported vector ends up depends on how the flow has
stretched and rotated the ink, but it is this distorted vector that is subtracted
from $Y(x + \epsilon X)$ to get $\epsilon\mathcal{L}_X Y = \epsilon[X, Y]$.
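The Killing condition $\mathcal{L}_X g = 0$ built from the three-term formula (2.37) is easy to probe numerically. The sketch below is an illustration added here, not the authors' method: it evaluates the three terms by central differences for the flat plane written in polar co-ordinates $(r, \phi)$, once for the translation field $\partial_x$ (an isometry) and once for the dilation $r\,\partial_r$ (not an isometry). The function names, the sample point and the step size are assumptions of this example.

```python
import math

def flat_metric_polar(p):
    """Euclidean plane metric in polar co-ordinates (r, phi): diag(1, r^2)."""
    r, _phi = p
    return [[1.0, 0.0], [0.0, r * r]]

def translation_x(p):
    """The Cartesian field d/dx written in polar components (X^r, X^phi)."""
    r, phi = p
    return [math.cos(phi), -math.sin(phi) / r]

def dilation(p):
    """Radial stretching r d/dr, which is not an isometry."""
    r, _phi = p
    return [r, 0.0]

def lie_derivative_metric(g, X, p, h=1e-6):
    """(L_X g)_{mn} = X^a d_a g_{mn} + g_{ma} d_n X^a + g_{an} d_m X^a at p."""
    n = len(p)

    def shifted(a, delta):
        q = list(p)
        q[a] += delta
        return q

    dX = []  # dX[a][b] = d_a X^b
    dg = []  # dg[a][m][v] = d_a g_{mv}
    for a in range(n):
        Xf, Xb = X(shifted(a, h)), X(shifted(a, -h))
        gf, gb = g(shifted(a, h)), g(shifted(a, -h))
        dX.append([(Xf[b] - Xb[b]) / (2 * h) for b in range(n)])
        dg.append([[(gf[m][v] - gb[m][v]) / (2 * h) for v in range(n)]
                   for m in range(n)])

    gp, Xp = g(p), X(p)
    result = [[0.0] * n for _ in range(n)]
    for m in range(n):
        for v in range(n):
            for a in range(n):
                result[m][v] += (Xp[a] * dg[a][m][v]
                                 + gp[m][a] * dX[v][a]
                                 + gp[a][v] * dX[m][a])
    return result

point = [2.0, 0.7]
killing = lie_derivative_metric(flat_metric_polar, translation_x, point)
stretch = lie_derivative_metric(flat_metric_polar, dilation, point)
print("L_X g for d/dx  :", killing)
print("L_X g for r d/dr:", stretch)
```

For the translation all four entries should vanish up to rounding, although each of the three terms in (2.37) is separately non-zero in these co-ordinates. For the dilation the result should be close to $2g = \mathrm{diag}(2, 2r^2)$, showing that the field rescales lengths rather than preserving them.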

Exercise 2.2: The metric on the unit sphere equipped with polar co-ordinates
is

$$g(\ ,\ ) = d\theta\otimes d\theta + \sin^2\theta\,d\phi\otimes d\phi.$$

Consider

$$V_x = -\sin\phi\,\partial_\theta - \cot\theta\cos\phi\,\partial_\phi,$$

the vector field of a rigid rotation about the $x$ axis. Compute the Lie derivative
$\mathcal{L}_{V_x}g$, and show that it is zero.

Exercise 2.3: Suppose we have an unstrained block of material in real space.
A co-ordinate system $\xi^1, \xi^2, \xi^3$ is attached to the atoms of the body. The
point with co-ordinate $\xi$ is located at $(x^1(\xi), x^2(\xi), x^3(\xi))$ where $x^1, x^2, x^3$ are
the usual $\mathbb{R}^3$ Cartesian co-ordinates.

a) Show that the induced metric in the $\xi$ co-ordinate system is

$$g_{\mu\nu}(\xi) = \sum_{a=1}^3\frac{\partial x^a}{\partial\xi^\mu}\frac{\partial x^a}{\partial\xi^\nu}.$$

b) The body is now deformed by an infinitesimal strain vector field $\eta(\xi)$.
The atom with co-ordinate $\xi^\mu$ is moved to what was $\xi^\mu + \eta^\mu(\xi)$, or equivalently,
the atom initially at Cartesian co-ordinate $x^a(\xi)$ is moved to
$x^a + \eta^\mu\,\partial x^a/\partial\xi^\mu$. Show that the new induced metric is

$$g_{\mu\nu} + \delta g_{\mu\nu} = g_{\mu\nu} + (\mathcal{L}_\eta g)_{\mu\nu}.$$

c) Define the strain tensor to be 1/2 of the Lie derivative of the metric
with respect to the deformation. If the original $\xi$ co-ordinate system
coincided with the Cartesian one, show that this definition reduces to
the familiar form

$$e_{ab} = \frac{1}{2}\left(\frac{\partial\eta_a}{\partial x^b} + \frac{\partial\eta_b}{\partial x^a}\right),$$

all tensors being Cartesian.

d) Part c) gave us the geometric definition of infinitesimal strain. If the
body is deformed substantially, the Cauchy-Green finite strain tensor is
defined as

$$E_{\mu\nu}(\xi) = \frac{1}{2}\left(g_{\mu\nu} - g^{(0)}_{\mu\nu}\right),$$

where $g^{(0)}_{\mu\nu}$ is the metric in the undeformed body and $g_{\mu\nu}$ that of the
deformed body. Explain why this is a reasonable definition.



2.3 Exterior Calculus 
2.3.1 Differential Forms 

The objects we introduced in section 2.1, the $dx^\mu$, are called one-forms, or
differential one-forms. They are fields living in the cotangent bundle $T^*M$
of $M$. More precisely, they are sections of the cotangent bundle. Sections
of the bundle whose fibre above $x \in M$ is the $p$-th skew-symmetric tensor
power $\bigwedge^p(T^*M_x)$ of the cotangent space are known as $p$-forms.
For example,

$$A = A_\mu dx^\mu = A_1dx^1 + A_2dx^2 + A_3dx^3 \tag{2.41}$$

is a 1-form,

$$F = \frac{1}{2}F_{\mu\nu}\,dx^\mu\wedge dx^\nu = F_{12}\,dx^1\wedge dx^2 + F_{23}\,dx^2\wedge dx^3 + F_{31}\,dx^3\wedge dx^1 \tag{2.42}$$

is a 2-form, and

$$\Omega = \frac{1}{3!}\Omega_{\mu\nu\sigma}\,dx^\mu\wedge dx^\nu\wedge dx^\sigma = \Omega_{123}\,dx^1\wedge dx^2\wedge dx^3 \tag{2.43}$$

is a 3-form. All the coefficients are skew-symmetric tensors, so, for example,

$$\Omega_{\mu\nu\sigma} = \Omega_{\nu\sigma\mu} = \Omega_{\sigma\mu\nu} = -\Omega_{\nu\mu\sigma} = -\Omega_{\mu\sigma\nu} = -\Omega_{\sigma\nu\mu}. \tag{2.44}$$

In each example we have explicitly written out all the independent terms for
the case of three dimensions. Note how the $p!$ disappears when we do this
and keep only distinct components. In $d$ dimensions the space of $p$-forms is
$d!/p!(d-p)!$ dimensional, and all $p$-forms with $p > d$ vanish identically.
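The counting of independent components can be made concrete with a few lines of Python. This sketch, which is an addition for illustration, labels each independent basis $p$-form $dx^{i_1}\wedge\cdots\wedge dx^{i_p}$ by an increasing index set $i_1 < \cdots < i_p$ and checks the count against the binomial coefficient. The function name `basis_p_forms` is made up for this example.

```python
from itertools import combinations
from math import comb

def basis_p_forms(d, p):
    """Increasing index sets (i1 < ... < ip) labelling the independent basis p-forms."""
    return list(combinations(range(1, d + 1), p))

d = 3
for p in range(0, d + 2):
    forms = basis_p_forms(d, p)
    # The number of independent p-forms is d!/(p!(d-p)!), and zero once p > d.
    assert len(forms) == comb(d, p)
    print(f"p = {p}: {len(forms)} basis form(s) {forms}")
```

For $d = 3$, $p = 2$ the index sets are $(1,2)$, $(1,3)$, $(2,3)$. The text prefers the cyclic ordering $dx^3\wedge dx^1$ to $dx^1\wedge dx^3$, which changes only the sign of the corresponding component.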

As with the wedge products in chapter one, we regard a $p$-form as a
$p$-linear skew-symmetric function with $p$ slots into which we can drop vectors to
get a number. For example the basis two-forms give

$$dx^\mu\wedge dx^\nu(\partial_\alpha, \partial_\beta) = \delta^\mu_\alpha\delta^\nu_\beta - \delta^\mu_\beta\delta^\nu_\alpha. \tag{2.45}$$

The analogous expression for a $p$-form would have $p!$ terms. We can define
an algebra of differential forms by "wedging" them together in the obvious
way, so that the product of a $p$-form with a $q$-form is a $(p+q)$-form. The
wedge product is associative and distributive but not, of course, commutative.
Instead, if $a$ is a $p$-form and $b$ a $q$-form, then

$$a\wedge b = (-1)^{pq}\,b\wedge a. \tag{2.46}$$

Actually it is customary in this game to suppress the "$\wedge$" and simply write
$F = \frac{1}{2}F_{\mu\nu}\,dx^\mu dx^\nu$, it being assumed that you know that $dx^\mu dx^\nu = -dx^\nu dx^\mu$.
What else could it be?



2.3.2 The Exterior Derivative 



These $p$-forms may seem rather complicated, so it is perhaps surprising that
all the vector calculus (div, grad, curl, the divergence theorem and Stokes'
theorem, etc.) that you have learned in the past reduces, in terms of them,
to two simple formulae! Indeed Élie Cartan's calculus of $p$-forms is slowly
supplanting traditional vector calculus, much as Willard Gibbs' and Oliver
Heaviside's vector calculus supplanted the tedious component-by-component
formulae you find in Maxwell's Treatise on Electricity and Magnetism.

The basic tool is the exterior derivative "$d$", which we now define
axiomatically:

i) If $f$ is a function (0-form), then $df$ coincides with the previous definition,
i.e. $df(X) = Xf$ for any vector field $X$.

ii) $d$ is an anti-derivation: If $a$ is a $p$-form and $b$ a $q$-form then

$$d(a\wedge b) = da\wedge b + (-1)^p\,a\wedge db. \tag{2.47}$$

iii) Poincaré's lemma: $d^2 = 0$, meaning that $d(da) = 0$ for any $p$-form $a$.

iv) $d$ is linear. That $d(\alpha a) = \alpha\,da$ for constant $\alpha$ follows already from i)
and ii), so the new fact is that $d(a + b) = da + db$.

It is not immediately obvious that axioms i), ii) and iii) are compatible
with one another. If we use axioms i), ii) and $d(dx^i) = 0$ to compute the $d$ of
$\Omega = \frac{1}{p!}\Omega_{i_1\ldots i_p}\,dx^{i_1}\cdots dx^{i_p}$, we find

$$\begin{aligned} d\Omega &= \frac{1}{p!}(d\Omega_{i_1\ldots i_p})\,dx^{i_1}\cdots dx^{i_p} \\ &= \frac{1}{p!}\partial_k\Omega_{i_1\ldots i_p}\,dx^k dx^{i_1}\cdots dx^{i_p}. \end{aligned} \tag{2.48}$$

Now compute

$$d(d\Omega) = \frac{1}{p!}\left(\partial_l\partial_k\Omega_{i_1\ldots i_p}\right)dx^l dx^k dx^{i_1}\cdots dx^{i_p}. \tag{2.49}$$

Fortunately this is zero because $\partial_l\partial_k\Omega = \partial_k\partial_l\Omega$, while $dx^l dx^k = -dx^k dx^l$.

If $A = A_1dx^1 + A_2dx^2 + A_3dx^3$ then

$$\begin{aligned} dA &= (\partial_1A_2 - \partial_2A_1)\,dx^1dx^2 + (\partial_3A_1 - \partial_1A_3)\,dx^3dx^1 + (\partial_2A_3 - \partial_3A_2)\,dx^2dx^3 \\ &= \frac{1}{2}F_{\mu\nu}\,dx^\mu dx^\nu, \end{aligned} \tag{2.50}$$

where

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu. \tag{2.51}$$

You will recognize the components of curl $\mathbf{A}$ hiding in here.

Similarly, if $F = F_{12}\,dx^1dx^2 + F_{23}\,dx^2dx^3 + F_{31}\,dx^3dx^1$ then

$$dF = \left(\frac{\partial F_{23}}{\partial x^1} + \frac{\partial F_{31}}{\partial x^2} + \frac{\partial F_{12}}{\partial x^3}\right)dx^1dx^2dx^3. \tag{2.52}$$

This looks like a divergence.

The axiom $d^2 = 0$ encompasses both "curl grad = 0" and "div curl = 0",
together with an infinite number of higher-dimensional analogues. The
familiar "curl $\equiv \nabla\times$", meanwhile, is only defined in three dimensional space.

The exterior derivative takes $p$-forms to $(p+1)$-forms, i.e. skew-symmetric
type $(0, p)$ tensors to skew-symmetric $(0, p+1)$ tensors. How does "$d$" get
around the fact that the derivative of a tensor is not a tensor? Well, if
you apply the transformation law for $A_\mu$, and the chain rule to $\partial_\mu$, to find
the transformation law for $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$, you will see why: all the
derivatives of the $\partial z^\nu/\partial x^\mu$ cancel, and $F_{\mu\nu}$ is a bona-fide tensor of type (0, 2). This
sort of cancellation is why skew-symmetric objects are useful, and symmetric
ones less so.
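The cancellation can be made explicit in two lines. The following is a short worked version of the computation, written with $\tilde A_\alpha(z)$ and $\tilde F_{\beta\alpha}$ for the components in a second co-ordinate system $z^\alpha$; the tilde notation is introduced only for this sketch.

```latex
\begin{align*}
A_\mu = \frac{\partial z^\alpha}{\partial x^\mu}\,\tilde A_\alpha
\quad&\Longrightarrow\quad
\partial_\mu A_\nu
 = \frac{\partial^2 z^\alpha}{\partial x^\mu\,\partial x^\nu}\,\tilde A_\alpha
 + \frac{\partial z^\beta}{\partial x^\mu}\frac{\partial z^\alpha}{\partial x^\nu}
   \frac{\partial \tilde A_\alpha}{\partial z^\beta}, \\
F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu
 &= \frac{\partial z^\beta}{\partial x^\mu}\frac{\partial z^\alpha}{\partial x^\nu}
   \left(\frac{\partial \tilde A_\alpha}{\partial z^\beta}
       - \frac{\partial \tilde A_\beta}{\partial z^\alpha}\right)
 = \frac{\partial z^\beta}{\partial x^\mu}\frac{\partial z^\alpha}{\partial x^\nu}\,
   \tilde F_{\beta\alpha}.
\end{align*}
```

The second-derivative term is symmetric under $\mu \leftrightarrow \nu$, so it drops out of the skew combination, and what is left is precisely the transformation law of a type (0, 2) tensor. The symmetric combination $\partial_\mu A_\nu + \partial_\nu A_\mu$ would instead keep twice the offending term.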

Exercise 2.4: Use axiom ii) to compute $d(d(a\wedge b))$ and confirm that it is zero.

Closed and exact forms

The Poincaré lemma, $d^2 = 0$, leads to some important terminology:

i) A $p$-form $\omega$ is said to be closed if $d\omega = 0$.

ii) A $p$-form $\omega$ is said to be exact if $\omega = d\eta$ for some $(p-1)$-form $\eta$.

An exact form is necessarily closed, but a closed form is not necessarily exact.
The question of when closed $\Rightarrow$ exact is one involving the global topology of
the space in which the forms are defined, and will be the subject of chapter 4.

Cartan's formulae 

It is sometimes useful to have expressions for the action of $d$ coupled with
the evaluation of the subsequent $(p+1)$-forms.

If $f$, $\eta$, $\omega$ are 0, 1, 2-forms, respectively, then $df$, $d\eta$, $d\omega$ are 1, 2, 3-forms.
When we plug in the appropriate number of vector fields $X$, $Y$, $Z$, then, after
some labour, we will find

$$df(X) = Xf, \tag{2.53}$$

$$d\eta(X, Y) = X\eta(Y) - Y\eta(X) - \eta([X, Y]), \tag{2.54}$$

$$\begin{aligned} d\omega(X, Y, Z) &= X\omega(Y, Z) + Y\omega(Z, X) + Z\omega(X, Y) \\ &\quad - \omega([X, Y], Z) - \omega([Y, Z], X) - \omega([Z, X], Y). \end{aligned} \tag{2.55}$$

These formulae, and their higher-$p$ analogues, express $d$ in terms of geometric
objects, and so make it clear that the exterior derivative is itself a geometric
object, independent of any particular co-ordinate choice.

Let us demonstrate the correctness of the second formula. With $\eta = \eta_\mu dx^\mu$,
the left-hand side, $d\eta(X, Y)$, is equal to

$$\partial_\mu\eta_\nu\,dx^\mu dx^\nu(X, Y) = \partial_\mu\eta_\nu(X^\mu Y^\nu - X^\nu Y^\mu). \tag{2.56}$$

The right-hand side is equal to

$$X^\mu\partial_\mu(\eta_\nu Y^\nu) - Y^\mu\partial_\mu(\eta_\nu X^\nu) - \eta_\nu(X^\mu\partial_\mu Y^\nu - Y^\mu\partial_\mu X^\nu). \tag{2.57}$$

On using the product rule for the derivatives in the first two terms, we find
that all derivatives of the components of $X$ and $Y$ cancel, and we are left with
exactly those terms appearing on the left.
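The same check can be run numerically. The sketch below, added here as an illustration, evaluates both sides of Cartan's formula (2.54) at one point of $\mathbb{R}^2$ using central differences. The one-form `eta`, the fields `X` and `Y`, the sample point and the helper names are arbitrary choices made for this example and carry no special meaning.

```python
import math

def eta(p):
    """Components (eta_x, eta_y) of a sample one-form on R^2."""
    x, y = p
    return [x * y, math.sin(x) + y * y]

def X(p):
    x, y = p
    return [y, x * x]

def Y(p):
    x, y = p
    return [math.cos(y), x + y]

def partial(f, p, mu, h=1e-5):
    """Central-difference d_mu of a list-valued function f at p."""
    fwd = list(p)
    bwd = list(p)
    fwd[mu] += h
    bwd[mu] -= h
    return [(a - b) / (2 * h) for a, b in zip(f(fwd), f(bwd))]

def pair(form, vec):
    """The scalar function form(vec), returned as a one-element list."""
    return lambda p: [sum(a * b for a, b in zip(form(p), vec(p)))]

def directional(V, f, p):
    """V f at p for a scalar function f that returns a one-element list."""
    Vp = V(p)
    return sum(Vp[mu] * partial(f, p, mu)[0] for mu in range(len(p)))

def bracket(A, B):
    """The Lie bracket [A, B] as a new vector field."""
    def field(p):
        n = len(p)
        Ap, Bp = A(p), B(p)
        dA = [partial(A, p, mu) for mu in range(n)]
        dB = [partial(B, p, mu) for mu in range(n)]
        return [sum(Ap[mu] * dB[mu][nu] - Bp[mu] * dA[mu][nu] for mu in range(n))
                for nu in range(n)]
    return field

p0 = [0.3, -0.8]
n = len(p0)
d_eta = [partial(eta, p0, mu) for mu in range(n)]   # d_eta[mu][nu] = d_mu eta_nu
Xp, Yp = X(p0), Y(p0)

# Left-hand side, as in (2.56): d_mu eta_nu (X^mu Y^nu - X^nu Y^mu)
lhs = sum(d_eta[mu][nu] * (Xp[mu] * Yp[nu] - Xp[nu] * Yp[mu])
          for mu in range(n) for nu in range(n))

# Right-hand side, as in (2.54): X eta(Y) - Y eta(X) - eta([X, Y])
rhs = (directional(X, pair(eta, Y), p0)
       - directional(Y, pair(eta, X), p0)
       - pair(eta, bracket(X, Y))(p0)[0])

print("d eta(X, Y)                       =", lhs)
print("X eta(Y) - Y eta(X) - eta([X, Y]) =", rhs)
```

The two numbers should agree to within the finite-difference error. Dropping the $\eta([X, Y])$ term spoils the agreement, which is a quick way to see that it is needed to cancel the derivatives of $X$ and $Y$.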

Exercise 2.5: Let $\omega^i$, $i = 1, \ldots, r$ be a linearly independent set of one-forms
defining a Pfaffian system (see sec. 2.2.1) in $d$ dimensions.

i) Use Cartan's formulae to show that the corresponding $(d-r)$-dimensional
distribution is involutive if and only if there is an $r$-by-$r$ matrix of 1-forms
$\theta^i{}_j$ such that

$$d\omega^i = \sum_{j=1}^r\theta^i{}_j\wedge\omega^j.$$

ii) Show that the conditions in part i) are satisfied if there are $r$ functions
$g^i$ and an invertible $r$-by-$r$ matrix of functions $f^i{}_j$ such that

$$\omega^i = \sum_{j=1}^r f^i{}_j\,dg^j.$$

In this case foliation surfaces are given by the conditions $g^i(x) = $ const.,
$i = 1, \ldots, r$.

It is also possible, but considerably harder, to show that i) $\Rightarrow$ ii). Doing so
would constitute a proof of Frobenius' theorem.

Exercise 2.6: Let $\omega$ be a closed two-form, and let Null($\omega$) be the space of
vector fields $X$ such that $\omega(X, \ ) = 0$. Use the Cartan formulae to show that
if $X, Y \in$ Null($\omega$), then $[X, Y] \in$ Null($\omega$).

Lie Derivative of Forms

Given a $p$-form $\omega$ and a vector field $X$, we can form a $(p-1)$-form called
$i_X\omega$ by writing

$$i_X\omega(\underbrace{\ldots\ldots}_{p-1\ \text{slots}}) = \omega(\overbrace{X, \underbrace{\ldots\ldots}_{p-1\ \text{slots}}}^{p\ \text{slots}}). \tag{2.58}$$

Acting on a 0-form, $i_X$ is defined to be 0. This procedure is called the interior
multiplication by $X$. It is simply a contraction

$$\omega_{j_1j_2\ldots j_p} \to \omega_{kj_2\ldots j_p}X^k, \tag{2.59}$$

but it is convenient to have a special symbol for this operation. It is perhaps
surprising that $i_X$ turns out to be an anti-derivation, just as is $d$. If $\eta$ and $\omega$
are $p$ and $q$ forms respectively, then

$$i_X(\eta\wedge\omega) = (i_X\eta)\wedge\omega + (-1)^p\,\eta\wedge(i_X\omega), \tag{2.60}$$

even though $i_X$ involves no differentiation. For example, if $X = X^\mu\partial_\mu$, then

$$\begin{aligned} i_X(dx^\mu\wedge dx^\nu) &= dx^\mu\wedge dx^\nu(X^\alpha\partial_\alpha, \ ) \\ &= X^\mu dx^\nu - dx^\mu X^\nu \\ &= (i_Xdx^\mu)\wedge(dx^\nu) - dx^\mu\wedge(i_Xdx^\nu). \end{aligned} \tag{2.61}$$

One reason for introducing $i_X$ is that there is a nice (and profound)
formula for the Lie derivative of a $p$-form in terms of $i_X$. The formula is
called the infinitesimal homotopy relation. It reads

$$\mathcal{L}_X\omega = (d\,i_X + i_X d)\,\omega. \tag{2.62}$$

This formula is proved by verifying that it is true for functions and one-forms,
and then showing that it is a derivation, in other words that it
satisfies Leibniz' rule. From the derivation property of the Lie derivative, we
immediately deduce that the formula works for any $p$-form.

That the formula is true for functions should be obvious: Since $i_Xf = 0$
by definition, we have

$$(d\,i_X + i_X d)f = i_X df = df(X) = Xf = \mathcal{L}_X f. \tag{2.63}$$

To show that the formula works for one-forms, we evaluate

$$\begin{aligned} (d\,i_X + i_X d)(f_\nu dx^\nu) &= d(f_\nu X^\nu) + i_X(\partial_\mu f_\nu\,dx^\mu dx^\nu) \\ &= \partial_\mu(f_\nu X^\nu)\,dx^\mu + \partial_\mu f_\nu(X^\mu dx^\nu - X^\nu dx^\mu) \\ &= (X^\nu\partial_\nu f_\mu + f_\nu\partial_\mu X^\nu)\,dx^\mu. \end{aligned} \tag{2.64}$$

In going from the second to the third line, we have interchanged the dummy
labels $\mu \leftrightarrow \nu$ in the term containing $dx^\nu$. We recognize that the 1-form in
the last line is indeed $\mathcal{L}_X f$.

To show that $d\,i_X + i_X d$ is a derivation we must apply $d\,i_X + i_X d$ to $a\wedge b$
and use the anti-derivation property of $i_X$ and $d$. This is straightforward once
we recall that $d$ takes a $p$-form to a $(p+1)$-form while $i_X$ takes a $p$-form to
a $(p-1)$-form.

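Exercise 2.7: Let

$$\omega = \frac{1}{p!}\omega_{i_1\ldots i_p}\,dx^{i_1}\cdots dx^{i_p}.$$

Use the anti-derivation property of $i_X$ to show that

$$i_X\omega = \frac{1}{(p-1)!}\omega_{\alpha i_2\ldots i_p}X^\alpha\,dx^{i_2}\cdots dx^{i_p},$$

and so verify the equivalence of (2.58) and (2.59).

Exercise 2.8: Use the infinitesimal homotopy relation to show that $\mathcal{L}$ and $d$
commute, i.e. for $\omega$ a $p$-form, we have

$$d(\mathcal{L}_X\omega) = \mathcal{L}_X(d\omega).$$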



2.4 Physical Applications 
2.4.1 Maxwell's Equations 

In relativistic$^3$ four-dimensional tensor notation the two source-free Maxwell's
equations

$$\operatorname{curl}\mathbf{E} = -\frac{\partial\mathbf{B}}{\partial t}, \qquad \operatorname{div}\mathbf{B} = 0, \tag{2.65}$$

reduce to the single equation

$$\frac{\partial F_{\mu\nu}}{\partial x^\lambda} + \frac{\partial F_{\nu\lambda}}{\partial x^\mu} + \frac{\partial F_{\lambda\mu}}{\partial x^\nu} = 0, \tag{2.66}$$

where

$$F_{\mu\nu} = \begin{pmatrix} 0 & -E_x & -E_y & -E_z \\ E_x & 0 & B_z & -B_y \\ E_y & -B_z & 0 & B_x \\ E_z & B_y & -B_x & 0 \end{pmatrix}. \tag{2.67}$$

The "F" is traditional, for Michael Faraday. In form language, the relativistic
equation becomes the even more compact expression $dF = 0$, where

$$\begin{aligned} F &= \frac{1}{2}F_{\mu\nu}\,dx^\mu dx^\nu \\ &= B_x\,dydz + B_y\,dzdx + B_z\,dxdy + E_x\,dxdt + E_y\,dydt + E_z\,dzdt \end{aligned} \tag{2.68}$$

is a Minkowski-space 2-form.

$^3$ In this section we will use units in which $c = \epsilon_0 = \mu_0 = 1$. We take the Minkowski
metric to be $g_{\mu\nu} = \operatorname{diag}(-1, 1, 1, 1)$ where $x^0 = t$, $x^1 = x$, etc.

Exercise 2.9: Verify that the source-free Maxwell equations are indeed equivalent
to $dF = 0$.



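The equation $dF = 0$ is automatically satisfied if we introduce a 4-vector
1-form potential $A = -\phi\,dt + A_x\,dx + A_y\,dy + A_z\,dz$ and set $F = dA$.

The two Maxwell equations with sources

$$\operatorname{div}\mathbf{D} = \rho, \qquad \operatorname{curl}\mathbf{H} = \mathbf{j} + \frac{\partial\mathbf{D}}{\partial t}, \tag{2.69}$$

reduce in 4-tensor notation to the single equation

$$\partial_\mu F^{\mu\nu} = -J^\nu. \tag{2.70}$$

Here $J^\mu = (\rho, \mathbf{j})$ is the current 4-vector.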

This source equation takes a little more work to express in form language,
but it can be done. We need a new concept: the Hodge "star" dual of a form.
In $d$ dimensions the "$\star$" map takes a $p$-form to a $(d-p)$-form. It depends
on both the metric and the orientation. The latter means a canonical choice
of the order in which to write our basis forms, with orderings that differ
by an even permutation being counted as the same. The full $d$-dimensional
definition involves the Levi-Civita duality operation of chapter 1, combined
with the use of the metric tensor to raise indices. Recall that $\sqrt{g} = \sqrt{\det g_{\mu\nu}}$.
(In Minkowski-signature metrics we should replace $\sqrt{g}$ by $\sqrt{-g}$.) We define
"$\star$" to be a linear map

$$\star : \bigwedge\nolimits^{p}(T^*M) \to \bigwedge\nolimits^{(d-p)}(T^*M) \tag{2.71}$$

such that

$$\star\,dx^{i_1}\cdots dx^{i_p} = \frac{1}{(d-p)!}\sqrt{g}\,g^{i_1j_1}\cdots g^{i_pj_p}\,\epsilon_{j_1\cdots j_pj_{p+1}\cdots j_d}\,dx^{j_{p+1}}\cdots dx^{j_d}. \tag{2.72}$$

Although this definition looks a trifle involved, computations involving it are
not so intimidating. The trick is to work, whenever possible, with oriented
orthonormal frames. If we are in euclidean space and $\{e^{*i_1}, e^{*i_2}, \ldots, e^{*i_d}\}$ is an
ordering of the orthonormal basis for $(T^*M)_x$ whose orientation is equivalent
to $\{e^{*1}, e^{*2}, \ldots, e^{*d}\}$ then

$$\star\,(e^{*i_1}\wedge e^{*i_2}\wedge\cdots\wedge e^{*i_p}) = e^{*i_{p+1}}\wedge e^{*i_{p+2}}\wedge\cdots\wedge e^{*i_d}. \tag{2.73}$$

For example, in three dimensions, and with $x$, $y$, $z$ our usual Cartesian
co-ordinates, we have

$$\star\,dx = dydz, \qquad \star\,dy = dzdx, \qquad \star\,dz = dxdy. \tag{2.74}$$

An analogous method works for Minkowski-signature $(-,+,+,+)$ metrics,
except that now we must include a minus sign for each negatively normed
$dt$ factor in the form being "starred." Taking $\{dt, dx, dy, dz\}$ as our oriented
basis, we therefore find$^4$

$$\begin{aligned} \star\,dxdy &= -dzdt, \\ \star\,dydz &= -dxdt, \\ \star\,dzdx &= -dydt, \\ \star\,dxdt &= dydz, \\ \star\,dydt &= dzdx, \\ \star\,dzdt &= dxdy. \end{aligned} \tag{2.75}$$

$^4$ See for example: Misner, Thorne and Wheeler, Gravitation (MTW), page 108.

For example, the first of these equations is derived by observing that
$(dxdy)(-dzdt) = dtdxdydz$, and that there is no "$dt$" in the product $dxdy$. The fourth
follows from observing that $(dxdt)(-dydz) = dtdxdydz$, but there is a
negative-normed "$dt$" in the product $dxdt$.
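The sign bookkeeping behind (2.75) is mechanical enough to automate. The sketch below, an illustration added here, applies the definition (2.72) to basis forms for the diagonal Minkowski metric, assuming $\epsilon_{0123} = +1$ for the ordering $(dt, dx, dy, dz)$ and $\sqrt{-g} = 1$. The function names and the index convention $0 = t$, $1 = x$, $2 = y$, $3 = z$ are choices made for this example.

```python
NAMES = ("dt", "dx", "dy", "dz")
ETA = (-1, 1, 1, 1)   # diagonal Minkowski metric, signature (-,+,+,+)

def permutation_sign(seq):
    """Sign of the permutation taking sorted(seq) to seq, found by counting inversions."""
    inversions = sum(1 for i in range(len(seq))
                     for j in range(i + 1, len(seq)) if seq[i] > seq[j])
    return -1 if inversions % 2 else 1

def hodge_star(indices):
    """Star of dx^{i1}...dx^{ip}: returns (sign, complementary indices in increasing order)."""
    rest = tuple(i for i in range(4) if i not in indices)
    sign = permutation_sign(tuple(indices) + rest)   # epsilon_{i1..ip j1..j(4-p)}
    for i in indices:
        sign *= ETA[i]                               # one metric factor per raised index
    return sign, rest

def show(indices):
    """Format the result as text, e.g. '*dxdt = +dydz'."""
    sign, rest = hodge_star(indices)
    lhs = "".join(NAMES[i] for i in indices)
    rhs = "".join(NAMES[i] for i in rest)
    return f"*{lhs} = {'+' if sign > 0 else '-'}{rhs}"

# The six two-forms of (2.75), in the order and orientation used there.
for two_form in [(1, 2), (2, 3), (3, 1), (1, 0), (2, 0), (3, 0)]:
    print(show(two_form))
```

The complementary factors come out in increasing index order, so the first line reads `*dxdy = +dtdz`, which is the same statement as $\star\,dxdy = -dzdt$ because $dt\,dz = -dz\,dt$. The other entries can be compared with (2.75) in the same way. One-forms work too: `hodge_star((0,))` gives $\star\,dt = -dxdydz$.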
The $\star$ map is constructed so that if

$$\alpha = \frac{1}{p!}\alpha_{i_1i_2\ldots i_p}\,dx^{i_1}dx^{i_2}\cdots dx^{i_p}, \tag{2.76}$$

and

$$\beta = \frac{1}{p!}\beta_{i_1i_2\ldots i_p}\,dx^{i_1}dx^{i_2}\cdots dx^{i_p}, \tag{2.77}$$

then

$$\alpha\wedge(\star\beta) = \beta\wedge(\star\alpha) = \langle\alpha, \beta\rangle\,\sigma, \tag{2.78}$$

where the inner product $\langle\alpha, \beta\rangle$ is defined to be the invariant

$$\langle\alpha, \beta\rangle = \frac{1}{p!}g^{i_1j_1}g^{i_2j_2}\cdots g^{i_pj_p}\,\alpha_{i_1i_2\ldots i_p}\beta_{j_1j_2\ldots j_p}, \tag{2.79}$$

and $\sigma$ is the volume form

$$\sigma = \sqrt{g}\,dx^1dx^2\cdots dx^d. \tag{2.80}$$

In future we will write $\alpha\star\beta$ for $\alpha\wedge(\star\beta)$. Bear in mind that the "$\star$" in this
expression is acting on $\beta$ and is not some new kind of binary operation.

We now apply these ideas to Maxwell. From the field-strength 2-form

$$F = B_x\,dydz + B_y\,dzdx + B_z\,dxdy + E_x\,dxdt + E_y\,dydt + E_z\,dzdt, \tag{2.81}$$

we get a dual 2-form

$$\star F = -B_x\,dxdt - B_y\,dydt - B_z\,dzdt + E_x\,dydz + E_y\,dzdx + E_z\,dxdy. \tag{2.82}$$

We can check that we have correctly computed the Hodge star of $F$ by taking
the wedge product, for which we find

$$F\star F = \frac{1}{2}(F_{\mu\nu}F^{\mu\nu})\,\sigma = (B_x^2 + B_y^2 + B_z^2 - E_x^2 - E_y^2 - E_z^2)\,dtdxdydz. \tag{2.83}$$

Observe that the expression $B^2 - E^2$ is a Lorentz scalar. Similarly, from the
current 1-form

$$J = J_\mu dx^\mu = -\rho\,dt + j_x\,dx + j_y\,dy + j_z\,dz, \tag{2.84}$$

we derive the dual current 3-form

$$\star J = \rho\,dxdydz - j_x\,dtdydz - j_y\,dtdzdx - j_z\,dtdxdy, \tag{2.85}$$

and check that

$$J\star J = (J_\mu J^\mu)\,\sigma = (-\rho^2 + j_x^2 + j_y^2 + j_z^2)\,dtdxdydz. \tag{2.86}$$

Observe that

$$d\star J = \left(\frac{\partial\rho}{\partial t} + \operatorname{div}\mathbf{j}\right)dtdxdydz = 0 \tag{2.87}$$

expresses the charge conservation law.

Writing out the terms explicitly shows that the source-containing Maxwell
equations reduce to $d\star F = \star J$. All four Maxwell equations are therefore very
compactly expressed as

$$dF = 0, \qquad d\star F = \star J.$$

Observe that current conservation $d\star J = 0$ follows from the second Maxwell
equation as a consequence of $d^2 = 0$.
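For readers who want to see the terms written out, here is a sketch of the computation of $d\star F$ from (2.82), using only $dx^\mu dx^\nu = -dx^\nu dx^\mu$ to bring each 3-form into the orderings used in (2.85). In the units of this section $\mathbf{D} = \mathbf{E}$ and $\mathbf{H} = \mathbf{B}$.

```latex
\begin{align*}
d{\star}F
 &= (\partial_x E_x + \partial_y E_y + \partial_z E_z)\,dx\,dy\,dz \\
 &\quad + (\partial_t E_x - \partial_y B_z + \partial_z B_y)\,dt\,dy\,dz \\
 &\quad + (\partial_t E_y - \partial_z B_x + \partial_x B_z)\,dt\,dz\,dx \\
 &\quad + (\partial_t E_z - \partial_x B_y + \partial_y B_x)\,dt\,dx\,dy .
\end{align*}
```

Comparing coefficients with $\star J = \rho\,dxdydz - j_x\,dtdydz - j_y\,dtdzdx - j_z\,dtdxdy$ gives $\operatorname{div}\mathbf{E} = \rho$ from the first line and $\partial_t\mathbf{E} - \operatorname{curl}\mathbf{B} = -\mathbf{j}$ from the remaining three, which are the two equations with sources.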

Exercise 2.10: Show that for a $p$-form $\omega$ in $d$ euclidean dimensions we have

$$\star\star\omega = (-1)^{p(d-p)}\,\omega.$$

Show, further, that for a Minkowski metric an additional minus sign has to be
inserted. (For example, $\star\star F = -F$, even though $(-1)^{2(4-2)} = +1$.)



2.4.2 Hamilton's Equations 

Hamiltonian dynamics takes place in phase space, a manifold with co-ordinates
$(q^1, \ldots, q^n, p^1, \ldots, p^n)$. Since momentum is a naturally covariant vector$^5$,
phase space is usually the co-tangent bundle $T^*M$ of the configuration manifold
$M$. We are writing the indices on the $p$'s upstairs though, because we
are considering them as co-ordinates in $T^*M$.

We expect that you are familiar with Hamilton's equations in their $q$, $p$
setting. Here, we shall describe them as they appear in a modern book on
Mechanics, such as Abraham and Marsden's Foundations of Mechanics, or
V. I. Arnold's Mathematical Methods of Classical Mechanics.

$^5$ To convince yourself of this, remember that in quantum mechanics $\hat p_\mu = -i\hbar\,\partial/\partial x^\mu$, and
the gradient of a function is a covector.

Phase space is an example of a symplectic manifold, a manifold equipped
with a symplectic form: a closed, non-degenerate 2-form field

$$\omega = \frac{1}{2}\omega_{ij}\,dx^i dx^j. \tag{2.88}$$

Recall that the word closed means that $d\omega = 0$. Non-degenerate means that
for any point $x$ the statement that $\omega(X, Y) = 0$ for all vectors $Y \in TM_x$
implies that $X = 0$ at that point (or equivalently that for all $x$ the matrix
$\omega_{ij}(x)$ has an inverse $\omega^{ij}(x)$).

Given a Hamiltonian function $H$ on our symplectic manifold, we define
a velocity vector-field $v_H$ by solving

$$dH = -i_{v_H}\omega = -\omega(v_H, \ ) \tag{2.89}$$

for $v_H$. If the symplectic form is $\omega = dp^1dq^1 + dp^2dq^2 + \cdots + dp^ndq^n$, this is
nothing but a fancy form of Hamilton's equations. To see this, we write

$$dH = \frac{\partial H}{\partial q^i}dq^i + \frac{\partial H}{\partial p^i}dp^i \tag{2.90}$$

and use the customary notation $(\dot q^i, \dot p^i)$ for the velocity-in-phase-space
components, so that

$$v_H = \dot q^i\frac{\partial}{\partial q^i} + \dot p^i\frac{\partial}{\partial p^i}. \tag{2.91}$$

Now we work out

$$i_{v_H}\omega = dp^idq^i(\dot q^j\partial_{q^j} + \dot p^j\partial_{p^j}, \ ) = \dot p^i dq^i - \dot q^i dp^i, \tag{2.92}$$

so, comparing coefficients of $dp^i$ and $dq^i$ on the two sides of $dH = -i_{v_H}\omega$, we
read off

$$\dot q^i = \frac{\partial H}{\partial p^i}, \qquad \dot p^i = -\frac{\partial H}{\partial q^i}. \tag{2.93}$$

Darboux' theorem, which we will not try to prove, says that for any point $x$
we can always find co-ordinates $p$, $q$, valid in some neighbourhood of $x$, such
that $\omega = dp^1dq^1 + dp^2dq^2 + \cdots + dp^ndq^n$. Given this fact, it is not unreasonable
to think that there is little to be gained by using the abstract differential-form
language. In simple cases this is so, and the traditional methods are fine.
It may be, however, that the neighbourhood of $x$ where the Darboux
co-ordinates work is not the entire phase space, and we need to cover the space
with overlapping $p$, $q$ co-ordinate charts. Then, what is a $p$ in one chart
will usually be a combination of $p$'s and $q$'s in another. In this case, the
traditional form of Hamilton's equations loses its appeal in comparison to
the co-ordinate-free $dH = -i_{v_H}\omega$.

Given two functions $H_1$, $H_2$ we can define their Poisson bracket $\{H_1, H_2\}$.
Its importance lies in Dirac's observation that the passage from classical
mechanics to quantum mechanics is accomplished by replacing the Poisson
bracket of two quantities, $A$ and $B$, with the commutator of the corresponding
operators $\hat A$ and $\hat B$:

$$i[\hat A, \hat B] \longleftrightarrow \hbar\{A, B\} + O(\hbar^2). \tag{2.94}$$

We define the Poisson bracket by$^6$

$$\{H_1, H_2\} \stackrel{\rm def}{=} \left.\frac{dH_2}{dt}\right|_{H_1} = v_{H_1}H_2. \tag{2.95}$$

Now, $v_{H_1}H_2 = dH_2(v_{H_1})$, and Hamilton's equations say that $dH_2(v_{H_1}) =
\omega(v_{H_1}, v_{H_2})$. Thus,

$$\{H_1, H_2\} = \omega(v_{H_1}, v_{H_2}). \tag{2.96}$$

The skew symmetry of $\omega(v_{H_1}, v_{H_2})$ shows that despite the asymmetrical
appearance of the definition we have skew symmetry: $\{H_1, H_2\} = -\{H_2, H_1\}$.
Moreover, since

$$v_{H_1}(H_2H_3) = (v_{H_1}H_2)H_3 + H_2(v_{H_1}H_3), \tag{2.97}$$

the Poisson bracket is a derivation:

$$\{H_1, H_2H_3\} = \{H_1, H_2\}H_3 + H_2\{H_1, H_3\}. \tag{2.98}$$

Neither the skew symmetry nor the derivation property requires the
condition that $d\omega = 0$. What does need $\omega$ to be closed is the Jacobi identity:

$$\{\{H_1, H_2\}, H_3\} + \{\{H_2, H_3\}, H_1\} + \{\{H_3, H_1\}, H_2\} = 0. \tag{2.99}$$

6 Our definition differs in sign from the traditional one, but has the advantage of
minimizing the number of minus signs in subsequent equations.
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We establish Jacobi by using Cartan's formula in the form 

$$\begin{aligned}
d\omega(v_{H_1}, v_{H_2}, v_{H_3}) ={}& v_{H_1}\omega(v_{H_2}, v_{H_3}) + v_{H_2}\omega(v_{H_3}, v_{H_1}) + v_{H_3}\omega(v_{H_1}, v_{H_2}) \\
& - \omega([v_{H_1}, v_{H_2}], v_{H_3}) - \omega([v_{H_2}, v_{H_3}], v_{H_1}) - \omega([v_{H_3}, v_{H_1}], v_{H_2}).
\end{aligned} \tag{2.100}$$

It is relatively straightforward to interpret each term in the first of these
lines as Poisson brackets. For example,

$$v_{H_1}\omega(v_{H_2}, v_{H_3}) = v_{H_1}\{H_2, H_3\} = \{H_1, \{H_2, H_3\}\}. \tag{2.101}$$

Relating the terms in the second line to Poisson brackets requires a little
more effort. We proceed as follows:

$$\begin{aligned}
\omega([v_{H_1}, v_{H_2}], v_{H_3}) &= -\omega(v_{H_3}, [v_{H_1}, v_{H_2}]) \\
&= dH_3([v_{H_1}, v_{H_2}]) \\
&= [v_{H_1}, v_{H_2}]H_3 \\
&= v_{H_1}(v_{H_2}H_3) - v_{H_2}(v_{H_1}H_3) \\
&= \{H_1, \{H_2, H_3\}\} - \{H_2, \{H_1, H_3\}\} \\
&= \{H_1, \{H_2, H_3\}\} + \{H_2, \{H_3, H_1\}\}.
\end{aligned} \tag{2.102}$$

Adding everything together now shows that

$$0 = d\omega(v_{H_1}, v_{H_2}, v_{H_3}) - \{\{H_1, H_2\}, H_3\} - \{\{H_2, H_3\}, H_1\} - \{\{H_3, H_1\}, H_2\}. \tag{2.103}$$

If we rearrange the Jacobi identity as

$$\{H_1, \{H_2, H_3\}\} - \{H_2, \{H_1, H_3\}\} = \{\{H_1, H_2\}, H_3\}, \tag{2.104}$$

we see that it is equivalent to

$$[v_{H_1}, v_{H_2}] = v_{\{H_1, H_2\}}.$$

The algebra of Poisson brackets is therefore homomorphic to the algebra of
the Lie brackets. The correspondence is not an isomorphism, however: the
assignment $H \mapsto v_H$ fails to be one-to-one because constant functions map
to the zero vector field.
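
As a check on the sign conventions, the definitions can be written out in Darboux co-ordinates. The following is only a sketch: it assumes $\omega = dp_i\,dq^i$ and the bracket defined as $\{H_1, H_2\} = v_{H_1}H_2$, with the index placement on $p_i$, $q^i$ chosen for the illustration.

```latex
% Velocity field of H_1, read off from dH_1 = -i_{v_{H_1}}\omega:
v_{H_1} = \frac{\partial H_1}{\partial p_i}\frac{\partial}{\partial q^i}
        - \frac{\partial H_1}{\partial q^i}\frac{\partial}{\partial p_i},
\qquad
% Poisson bracket as the rate of change of H_2 along the flow of H_1:
\{H_1, H_2\} = v_{H_1}H_2
 = \frac{\partial H_1}{\partial p_i}\frac{\partial H_2}{\partial q^i}
 - \frac{\partial H_1}{\partial q^i}\frac{\partial H_2}{\partial p_i}.
% Special cases for a single degree of freedom:
\{p, q\} = 1, \qquad \{q, p\} = -1 .
```

The value $\{q, p\} = -1$ is opposite in sign to the traditional bracket, which is the sign difference the definition's footnote refers to.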

Exercise 2.11: Use the infinitesimal homotopy relation to show that $\mathcal{L}_{v_H}\omega =
0$, where $v_H$ is the vector field corresponding to $H$. Suppose now that the phase
space is $2n$ dimensional. Show that in local Darboux co-ordinates the $2n$-form
$\omega^n/n!$ is, up to a sign, the phase-space volume element $d^np\,d^nq$. Show that
$\mathcal{L}_{v_H}\omega^n/n! = 0$ and that this result is Liouville's theorem on the conservation
of phase-space volume.
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The classical mechanics of spin 

It is sometimes said in books on quantum mechanics that the spin of an
electron, or other elementary particle, is a purely quantum concept and cannot
be described by classical mechanics. This statement is false, but spin is the
simplest system in which traditional physicist's methods become ugly and it
helps to use the modern symplectic language. A "spin" $\mathbf{S}$ can be regarded
as a fixed length vector that can point in any direction in $\mathbb{R}^3$. We will take
it to be of unit length so that its components are

$$S_x = \sin\theta\cos\phi, \qquad S_y = \sin\theta\sin\phi, \qquad S_z = \cos\theta, \tag{2.105}$$

where $\theta$ and $\phi$ are polar co-ordinates on the two-sphere $S^2$.

The surface of the sphere turns out to be both the configuration space
and the phase space. In particular the phase space for a spin is not the
cotangent bundle of the configuration space. This has to be so: we learned
from Niels Bohr that a $2n$-dimensional phase space contains roughly one
quantum state for every $h^n$ of phase-space volume. A cotangent bundle
always has infinite volume, so its corresponding Hilbert space is necessarily
infinite dimensional. A quantum spin, however, has a finite-dimensional
Hilbert space so its classical phase space must have a finite total volume.
This finite-volume phase space seems un-natural in the traditional view of
mechanics, but it fits comfortably into the modern symplectic picture.

We want to treat all points on the sphere alike, and so it is natural to take
the symplectic 2-form to be proportional to the element of area. Suppose that
$\omega = \sin\theta\,d\theta\,d\phi$. We could write $\omega = -d\cos\theta\,d\phi$ and regard $\phi$ as "$q$" and $-\cos\theta$
as "$p$" (Darboux' theorem in action!), but this identification is singular at the
north and south poles of the sphere, and, besides, it obscures the spherical
symmetry of the problem, which is manifest when we think of $\omega$ as $d(\text{area})$.

Let us take our hamiltonian to be $H = BS_x$, corresponding to an applied
magnetic field in the $x$ direction, and see what Hamilton's equations give for
the motion. First we take the exterior derivative

$$d(BS_x) = B(\cos\theta\cos\phi\,d\theta - \sin\theta\sin\phi\,d\phi). \tag{2.106}$$

This is to be set equal to

$$-\omega(v_{BS_x},\ \cdot\ ) = v^\theta(-\sin\theta)\,d\phi + v^\phi\sin\theta\,d\theta. \tag{2.107}$$

Comparing coefficients of $d\theta$ and $d\phi$, we get

$$v_{(BS_x)} = v^\theta\partial_\theta + v^\phi\partial_\phi = B(\sin\phi\,\partial_\theta + \cos\phi\cot\theta\,\partial_\phi), \tag{2.108}$$

i.e. $-B$ times the velocity vector for a rotation about the $x$ axis. This velocity
field therefore describes a steady Larmor precession of the spin about the
applied field. This is exactly the motion predicted by quantum mechanics.
Similarly, setting $B = 1$, we find

$$\begin{aligned}
v_{S_y} &= -\cos\phi\,\partial_\theta + \sin\phi\cot\theta\,\partial_\phi, \\
v_{S_z} &= -\partial_\phi.
\end{aligned} \tag{2.109}$$

From these velocity fields we can compute the Poisson brackets:

$$\begin{aligned}
\{S_x, S_y\} &= \omega(v_{S_x}, v_{S_y}) \\
&= \sin\theta\,d\theta\,d\phi\,(\sin\phi\,\partial_\theta + \cos\phi\cot\theta\,\partial_\phi,\ -\cos\phi\,\partial_\theta + \sin\phi\cot\theta\,\partial_\phi) \\
&= \sin\theta\,(\sin^2\phi\cot\theta + \cos^2\phi\cot\theta) \\
&= \cos\theta = S_z.
\end{aligned}$$

Repeating the exercise leads to

$$\begin{aligned}
\{S_x, S_y\} &= S_z, \\
\{S_y, S_z\} &= S_x, \\
\{S_z, S_x\} &= S_y.
\end{aligned} \tag{2.110}$$

These Poisson brackets for our classical "spin" are to be compared with the
commutation relations $[\hat S_x, \hat S_y] = i\hbar\hat S_z$ etc. for the quantum spin operators
$\hat S_i$.
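
The three brackets in (2.110) can be spot-checked numerically. The sketch below is illustrative only: the function names are made up for the example, vectors are stored as pairs $(v^\theta, v^\phi)$, and the velocity fields are the ones quoted in (2.108) with $B = 1$ and in (2.109).

```python
import math

def omega(theta, v, w):
    """Area form sin(theta) dtheta dphi evaluated on v, w = (v^theta, v^phi)."""
    return math.sin(theta) * (v[0] * w[1] - v[1] * w[0])

def spin(theta, phi):
    """Components (S_x, S_y, S_z) of the unit spin, as in (2.105)."""
    return (math.sin(theta) * math.cos(phi),
            math.sin(theta) * math.sin(phi),
            math.cos(theta))

def velocity_fields(theta, phi):
    """Hamiltonian vector fields of S_x, S_y, S_z from (2.108) and (2.109)."""
    cot = math.cos(theta) / math.sin(theta)
    v_sx = (math.sin(phi), math.cos(phi) * cot)
    v_sy = (-math.cos(phi), math.sin(phi) * cot)
    v_sz = (0.0, -1.0)
    return v_sx, v_sy, v_sz

def brackets(theta, phi):
    """Return ({S_x,S_y}, {S_y,S_z}, {S_z,S_x}) computed as omega(v_a, v_b)."""
    vx, vy, vz = velocity_fields(theta, phi)
    return (omega(theta, vx, vy), omega(theta, vy, vz), omega(theta, vz, vx))

if __name__ == "__main__":
    theta, phi = 1.1, 0.7          # an arbitrary point away from the poles
    print("brackets:", brackets(theta, phi))
    print("expected (S_z, S_x, S_y):",
          (spin(theta, phi)[2], spin(theta, phi)[0], spin(theta, phi)[1]))
```

The check has to be made away from the poles, where $\cot\theta$ and the co-ordinates themselves are singular.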



2.5 Covariant Derivatives 

Covariant derivatives are a general class of derivatives that act on sections 
of a vector or tensor bundle over a manifold. We will begin by considering 
derivatives on the tangent bundle, and in the exercises indicate how the idea 
generalizes to other bundles. 
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2.5.1 Connections 

The Lie and exterior derivatives require no structure beyond that which
comes for free with our manifold. Another type of derivative that can act on
tangent-space vectors and tensors is the covariant derivative $\nabla_X = X^\mu\nabla_\mu$.
This requires an additional mathematical object called an affine connection.
The covariant derivative is defined by:

i) Its action on scalar functions as

$$\nabla_X f = Xf. \tag{2.111}$$

ii) Its action on a basis set of tangent-vector fields $e_a(x) = e_a^\mu(x)\partial_\mu$ (a local
frame, or vielbein$^7$) by introducing a set of functions $\omega^i{}_{jk}(x)$ and setting

$$\nabla_{e_k}e_j = e_i\,\omega^i{}_{jk}. \tag{2.112}$$

iii) Extending this definition to any other type of tensor by requiring $\nabla_X$
to be a derivation.

iv) Requiring that the result of applying $\nabla_X$ to a tensor is a tensor of the
same type.

The set of functions $\omega^i{}_{jk}(x)$ is the connection. In any local co-ordinate chart
we can choose them at will, and different choices define different covariant
derivatives. (There may be global compatibility constraints, however, which
appear when we assemble the charts into an atlas.)

Warning: Despite having the appearance of one, $\omega^i{}_{jk}$ is not a tensor. It
transforms inhomogeneously under a change of frame or co-ordinates; see
equation (2.131).

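We can, of course, take as our basis vectors the co-ordinate vectors $e_\mu =
\partial_\mu$. When we do this it is traditional to use the symbol $\Gamma$ for the co-ordinate
frame connection instead of $\omega$. Thus,

$$\nabla_\mu e_\nu \equiv \nabla_{e_\mu}e_\nu = e_\lambda\Gamma^\lambda_{\nu\mu}. \tag{2.113}$$

The numbers $\Gamma^\lambda_{\nu\mu}$ are often called Christoffel symbols.

As an example consider the covariant derivative of a vector $f^\nu e_\nu$. Using
the derivation property we have

$$\begin{aligned}
\nabla_\mu(f^\nu e_\nu) &= (\partial_\mu f^\nu)e_\nu + f^\nu\nabla_\mu e_\nu \\
&= (\partial_\mu f^\nu)e_\nu + f^\nu e_\lambda\Gamma^\lambda_{\nu\mu} \\
&= e_\nu\left\{\partial_\mu f^\nu + f^\lambda\Gamma^\nu_{\lambda\mu}\right\}.
\end{aligned} \tag{2.114}$$

7 In practice viel, "many", is replaced by the appropriate German numeral: ein-, zwei-,
drei-, vier-, fünf-, ..., indicating the dimension. The word bein means "leg."

In the first line we have used the defining property that $\nabla_{e_\mu}$ acts on the
functions $f^\nu$ as $\partial_\mu$, and in the last line we interchanged the dummy indices
$\nu$ and $\lambda$. We often abuse the notation by writing only the components, and
set

$$\nabla_\mu f^\nu = \partial_\mu f^\nu + f^\lambda\Gamma^\nu_{\lambda\mu}. \tag{2.115}$$

Similarly, acting on the components of a mixed tensor, we would write

$$\nabla_\mu A^\alpha{}_{\beta\gamma} = \partial_\mu A^\alpha{}_{\beta\gamma} + \Gamma^\alpha_{\lambda\mu}A^\lambda{}_{\beta\gamma} - \Gamma^\lambda_{\beta\mu}A^\alpha{}_{\lambda\gamma} - \Gamma^\lambda_{\gamma\mu}A^\alpha{}_{\beta\lambda}. \tag{2.116}$$

When we use this notation, we are no longer regarding the tensor components
as "functions."

Observe that the plus and minus signs in (2.116) are required so that, for
example, the covariant derivative of the scalar function $f_\alpha g^\alpha$ is

$$\begin{aligned}
\nabla_\mu(f_\alpha g^\alpha) = \partial_\mu(f_\alpha g^\alpha) &= (\partial_\mu f_\alpha)g^\alpha + f_\alpha(\partial_\mu g^\alpha) \\
&= \left(\partial_\mu f_\alpha - f_\lambda\Gamma^\lambda_{\alpha\mu}\right)g^\alpha + f_\alpha\left(\partial_\mu g^\alpha + g^\lambda\Gamma^\alpha_{\lambda\mu}\right) \\
&= (\nabla_\mu f_\alpha)g^\alpha + f_\alpha(\nabla_\mu g^\alpha),
\end{aligned} \tag{2.117}$$

and so satisfies the derivation property.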
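
A small worked instance of (2.115) may help fix the index conventions. It is a sketch under an assumption not made in the text so far: the plane is described in polar co-ordinates $(r, \theta)$ and the connection is taken to have the non-zero coefficients $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$ (the usual flat-space values). The constant Cartesian field $\partial_x$ should then have vanishing covariant derivative.

```latex
% The constant field \partial_x expressed in the polar co-ordinate frame:
\partial_x = \cos\theta\,\partial_r - \frac{\sin\theta}{r}\,\partial_\theta,
\qquad f^r = \cos\theta, \quad f^\theta = -\frac{\sin\theta}{r}.
% Apply \nabla_\mu f^\nu = \partial_\mu f^\nu + f^\lambda\Gamma^\nu_{\lambda\mu}:
\nabla_\theta f^r = -\sin\theta + f^\theta\Gamma^r_{\theta\theta}
                  = -\sin\theta + \left(-\frac{\sin\theta}{r}\right)(-r) = 0,
\qquad
\nabla_\theta f^\theta = -\frac{\cos\theta}{r} + f^r\Gamma^\theta_{r\theta}
                       = -\frac{\cos\theta}{r} + \frac{\cos\theta}{r} = 0,
\nabla_r f^r = 0,
\qquad
\nabla_r f^\theta = \frac{\sin\theta}{r^2} + f^\theta\Gamma^\theta_{\theta r}
                  = \frac{\sin\theta}{r^2} - \frac{\sin\theta}{r^2} = 0 .
```

The components $f^r$, $f^\theta$ are not constant functions, yet all four covariant derivatives vanish: the connection terms compensate for the rotation of the co-ordinate frame.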



Parallel transport 

We have defined the covariant derivative via its formal calculus properties.
It has, however, a geometrical interpretation. As with the Lie derivative, in
order to compute the derivative along $X$ of the vector field $Y$, we have to
somehow carry the vector $Y(x)$ from the tangent space $TM_x$ to the tangent
space $TM_{x+\epsilon X}$, where we can subtract it from $Y(x + \epsilon X)$. The Lie derivative
carries $Y$ along with the $X$ flow. The covariant derivative implicitly carries
$Y$ by "parallel transport". If $\gamma : s \mapsto x^\mu(s)$ is a parameterized curve with
tangent vector $X^\mu\partial_\mu$, where

$$X^\mu = \frac{dx^\mu}{ds}, \tag{2.118}$$

then we say that the vector field $Y(x^\mu(s))$ is parallel transported along the
curve $\gamma$ if

$$\nabla_X Y = 0, \tag{2.119}$$

at each point $x^\mu(s)$. Thus, a vector that in the vielbein frame $e_i$ at $x$ has
components $Y^i$ will, after being parallel transported to $x + \epsilon X$, end up with
components

$$Y^i - \epsilon\,\omega^i{}_{jk}Y^jX^k. \tag{2.120}$$

In a co-ordinate frame, after parallel transport through an infinitesimal
displacement $\delta x^\mu$, the vector $Y^\nu\partial_\nu$ will have components

$$Y^\nu \to Y^\nu - \Gamma^\nu_{\lambda\mu}Y^\lambda\delta x^\mu, \tag{2.121}$$

and so

$$\begin{aligned}
\delta x^\mu\nabla_\mu Y^\nu &= Y^\nu(x^\mu + \delta x^\mu) - \left\{Y^\nu(x) - \Gamma^\nu_{\lambda\mu}Y^\lambda\delta x^\mu\right\} \\
&= \delta x^\mu\left\{\partial_\mu Y^\nu + \Gamma^\nu_{\lambda\mu}Y^\lambda\right\}.
\end{aligned} \tag{2.122}$$

Curvature and Torsion 

As we said earlier, the connection $\omega^i{}_{jk}(x)$ is not itself a tensor. Two important
quantities which are tensors are associated with $\nabla_X$:

i) The torsion

$$T(X, Y) = \nabla_X Y - \nabla_Y X - [X, Y]. \tag{2.123}$$

The quantity $T(X, Y)$ is a vector depending linearly on $X$, $Y$, so $T$ at
the point $x$ is a map $TM_x \times TM_x \to TM_x$, and so a tensor of type
(1,2). In a co-ordinate frame it has components

$$T^\lambda{}_{\mu\nu} = \Gamma^\lambda_{\nu\mu} - \Gamma^\lambda_{\mu\nu}. \tag{2.124}$$

ii) The Riemann curvature tensor

$$R(X, Y)Z = \nabla_X\nabla_Y Z - \nabla_Y\nabla_X Z - \nabla_{[X,Y]}Z. \tag{2.125}$$

The quantity $R(X, Y)Z$ is also a vector, so $R(X, Y)$ is a linear map
$TM_x \to TM_x$, and thus $R$ itself is a tensor of type (1,3). Written out
in a co-ordinate frame, we have

$$R^\alpha{}_{\beta\mu\nu} = \partial_\mu\Gamma^\alpha_{\beta\nu} - \partial_\nu\Gamma^\alpha_{\beta\mu} + \Gamma^\alpha_{\lambda\mu}\Gamma^\lambda_{\beta\nu} - \Gamma^\alpha_{\lambda\nu}\Gamma^\lambda_{\beta\mu}. \tag{2.126}$$

If our manifold comes equipped with a metric tensor $g_{\mu\nu}$ (and is thus
a Riemann manifold), and if we require both that $T = 0$ and $\nabla_\mu g_{\alpha\beta} = 0$,
then the connection is uniquely determined, and is called the Riemann, or
Levi-Civita, connection. In a co-ordinate frame it is given by

$$\Gamma^\alpha_{\mu\nu} = \frac{1}{2}g^{\alpha\lambda}\left(\partial_\mu g_{\lambda\nu} + \partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu}\right). \tag{2.127}$$

This is the connection that appears in General Relativity. 
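
Formula (2.127) is mechanical enough to evaluate numerically. The sketch below is an illustration, not part of the text's development: it assumes the round metric $ds^2 = d\theta^2 + \sin^2\theta\,d\phi^2$ on the unit two-sphere, differentiates it by central differences, and assembles $\Gamma^\alpha_{\mu\nu}$. All function names are hypothetical.

```python
import math

def metric(x):
    """Round metric g_{mu nu} on the unit two-sphere; x = (theta, phi)."""
    theta = x[0]
    return [[1.0, 0.0], [0.0, math.sin(theta) ** 2]]

def inverse_2x2(g):
    """Inverse of a 2x2 matrix, giving g^{mu nu}."""
    det = g[0][0] * g[1][1] - g[0][1] * g[1][0]
    return [[g[1][1] / det, -g[0][1] / det],
            [-g[1][0] / det, g[0][0] / det]]

def d_metric(x, h=1e-5):
    """dg[lam][mu][nu] approximates d_lam g_{mu nu} by central differences."""
    dg = []
    for lam in range(2):
        xp, xm = list(x), list(x)
        xp[lam] += h
        xm[lam] -= h
        gp, gm = metric(xp), metric(xm)
        dg.append([[(gp[m][n] - gm[m][n]) / (2.0 * h) for n in range(2)]
                   for m in range(2)])
    return dg

def christoffel(x):
    """G[a][m][n] = Gamma^a_{mn}, assembled term by term from (2.127)."""
    ginv = inverse_2x2(metric(x))
    dg = d_metric(x)
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    for a in range(2):
        for m in range(2):
            for n in range(2):
                G[a][m][n] = 0.5 * sum(
                    ginv[a][l] * (dg[m][l][n] + dg[n][m][l] - dg[l][m][n])
                    for l in range(2))
    return G

if __name__ == "__main__":
    G = christoffel((1.0, 0.3))
    print("Gamma^theta_{phi phi} =", G[0][1][1])   # analytically -sin(theta)cos(theta)
    print("Gamma^phi_{theta phi} =", G[1][0][1])   # analytically cot(theta)
```

The finite-difference step is a convenience for the example; for a metric given in closed form the derivatives could equally be supplied analytically.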

The curvature tensor measures the degree of path dependence in parallel
transport: if $Y^\nu(x)$ is parallel transported along a path $\gamma : s \mapsto x^\mu(s)$ from
$a$ to $b$, and if we deform $\gamma$ so that $x^\mu(s) \to x^\mu(s) + \delta x^\mu(s)$ while keeping the
endpoints $a$, $b$ fixed, then

$$\delta Y^\alpha(b) = -\int_a^b R^\alpha{}_{\beta\mu\nu}(x)\,Y^\beta(x)\,\delta x^\mu\,dx^\nu. \tag{2.128}$$

If $R^\alpha{}_{\beta\mu\nu} \equiv 0$ then the effect of parallel transport from $a$ to $b$ will be
independent of the route taken.
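
Path dependence can be seen directly by iterating the infinitesimal rule (2.121) around a closed curve. The sketch below is illustrative and makes several assumptions: the manifold is the unit two-sphere with its Levi-Civita connection ($\Gamma^\theta_{\phi\phi} = -\sin\theta\cos\theta$, $\Gamma^\phi_{\theta\phi} = \Gamma^\phi_{\phi\theta} = \cot\theta$), the curve is the latitude circle $\theta = \theta_0$, and the transport is done with simple first-order steps. The function names are hypothetical.

```python
import math

def gamma(theta):
    """G[n][l][m] = Gamma^n_{lm} on the unit sphere; index 0 = theta, 1 = phi."""
    G = [[[0.0, 0.0], [0.0, 0.0]], [[0.0, 0.0], [0.0, 0.0]]]
    G[0][1][1] = -math.sin(theta) * math.cos(theta)
    G[1][0][1] = G[1][1][0] = math.cos(theta) / math.sin(theta)
    return G

def transport_around_latitude(theta0, Y, steps=20000):
    """Carry Y = (Y^theta, Y^phi) once around theta = theta0 using (2.121)."""
    G = gamma(theta0)                      # theta is constant along the curve
    dx = (0.0, 2.0 * math.pi / steps)      # displacement (dtheta, dphi) per step
    Y = list(Y)
    for _ in range(steps):
        Y = [Y[n] - sum(G[n][l][m] * Y[l] * dx[m]
                        for l in range(2) for m in range(2))
             for n in range(2)]
    return Y

def orthonormal_components(theta0, Y):
    """Components of Y along the unit vectors in the theta and phi directions."""
    return (Y[0], math.sin(theta0) * Y[1])

if __name__ == "__main__":
    theta0 = math.pi / 3.0
    Y = transport_around_latitude(theta0, (1.0, 0.0))
    print(orthonormal_components(theta0, Y))
```

For this curve the transported vector returns rotated through the angle $2\pi\cos\theta_0$ relative to the local frame, so at $\theta_0 = \pi/3$ a vector that starts along $\partial_\theta$ comes back reversed, up to the small error of the first-order stepping. On a flat surface it would return unchanged.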

The geometric interpretation of the torsion is less transparent. On a two-dimensional
surface a connection is torsion free when the tangent space "rolls without
slipping" along the curve $\gamma$.

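Exercise 2.12: Metric compatibility. Show that the Riemann connection

$$\Gamma^\alpha_{\mu\nu} = \frac{1}{2}g^{\alpha\lambda}\left(\partial_\mu g_{\lambda\nu} + \partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu}\right)$$

follows from the torsion-free condition $\Gamma^\alpha_{\mu\nu} = \Gamma^\alpha_{\nu\mu}$ together with the metric
compatibility condition

$$\nabla_\mu g_{\alpha\beta} \equiv \partial_\mu g_{\alpha\beta} - \Gamma^\nu_{\alpha\mu}g_{\nu\beta} - \Gamma^\nu_{\beta\mu}g_{\alpha\nu} = 0.$$

Show that "metric compatibility" means that the operation of raising or
lowering indices commutes with covariant derivation.

Exercise 2.13: Geodesic equation. Let $\gamma : s \mapsto x^\mu(s)$ be a parametrized
path from $a$ to $b$. Show that the Euler-Lagrange equation that follows from
minimizing the distance functional

$$S(\gamma) = \int_a^b\sqrt{g_{\mu\nu}\dot x^\mu\dot x^\nu}\,ds,$$

where the dots denote differentiation with respect to the parameter $s$, is

$$\frac{d^2x^\mu}{ds^2} + \Gamma^\mu_{\alpha\beta}\frac{dx^\alpha}{ds}\frac{dx^\beta}{ds} = 0.$$

Here $\Gamma^\mu_{\alpha\beta}$ is the Riemann connection (2.127).

Exercise 2.14: Show that if $A^\mu$ is a vector field then, for the Riemann
connection,

$$\nabla_\mu A^\mu = \frac{1}{\sqrt{g}}\frac{\partial\left(\sqrt{g}A^\mu\right)}{\partial x^\mu}.$$

In other words, show that

$$\Gamma^\alpha_{\alpha\mu} = \frac{1}{\sqrt{g}}\frac{\partial\sqrt{g}}{\partial x^\mu}.$$

Deduce that the Laplacian acting on a scalar field $\phi$ can be defined by setting
either

$$\nabla^2\phi = g^{\mu\nu}\nabla_\mu\nabla_\nu\phi,$$

or

$$\nabla^2\phi = \frac{1}{\sqrt{g}}\frac{\partial}{\partial x^\mu}\left(\sqrt{g}\,g^{\mu\nu}\frac{\partial\phi}{\partial x^\nu}\right),$$

the two definitions being equivalent.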



2.5.2 Cartan's Form Viewpoint 

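Let $e^{*j}(x) = e^{*j}{}_\mu(x)\,dx^\mu$ be the basis of one-forms dual to the vielbein frame
$e_i(x) = e_i^\mu(x)\partial_\mu$. Since

$$\delta^i_j = e^{*i}(e_j) = e^{*i}{}_\mu e_j^\mu, \tag{2.129}$$

the matrices $e^{*i}{}_\mu$ and $e_i^\mu$ are inverses of one another. We can use them to
change from roman vielbein indices to greek co-ordinate frame indices. For
example:

$$g_{ij} = g(e_i, e_j) = e_i^\mu g_{\mu\nu}e_j^\nu, \tag{2.130}$$

and

$$\omega^i{}_{jk} = e^{*i}{}_\nu(\partial_\mu e_j^\nu)e_k^\mu + e^{*i}{}_\lambda e_j^\nu e_k^\mu\Gamma^\lambda_{\nu\mu}. \tag{2.131}$$

Cartan regards the connection as being a matrix $\Omega$ of one-forms with
entries $\omega^i{}_j = \omega^i{}_{j\mu}dx^\mu$. In this language equation (2.112) becomes

$$\nabla_X e_j = e_i\,\omega^i{}_j(X). \tag{2.132}$$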

Cartan's viewpoint separates off the index $\mu$, which refers to the direction
$\delta x^\mu \propto X^\mu$ in which we are differentiating, from the matrix indices $i$ and
$j$ that act on the components of the vector or tensor being differentiated.
This separation becomes very natural when the vector space spanned by the
$e_i(x)$ is no longer the tangent space, but some other "internal" vector space
attached to the point $x$. Such internal spaces are common in physics, an
important example being the "colour space" of gauge field theories. Physicists,
following Hermann Weyl, call a connection on an internal space a "gauge
potential." To mathematicians it is simply a connection on the vector bundle
that has the internal spaces as its fibres.

Cartan also regards the torsion $T$ and curvature $R$ as forms; in this case
vector- and matrix-valued two-forms, respectively, with entries

$$T^i = \frac{1}{2}T^i{}_{\mu\nu}\,dx^\mu dx^\nu, \tag{2.133}$$

$$R^i{}_k = \frac{1}{2}R^i{}_{k\mu\nu}\,dx^\mu dx^\nu. \tag{2.134}$$

In his form language the equations defining the torsion and curvature become
Cartan's structure equations:

$$de^{*i} + \omega^i{}_j \wedge e^{*j} = T^i, \tag{2.135}$$

and

$$d\omega^i{}_k + \omega^i{}_j \wedge \omega^j{}_k = R^i{}_k. \tag{2.136}$$

The last equation can be written more compactly as

$$d\Omega + \Omega \wedge \Omega = R. \tag{2.137}$$

From this, by taking the exterior derivative, we obtain the Bianchi identity

$$dR - R \wedge \Omega + \Omega \wedge R = 0. \tag{2.138}$$

On a Riemann manifold, we can take the vielbein frame $e_i$ to be orthonormal.
In this case the roman-index metric $g_{ij} = g(e_i, e_j)$ becomes $\delta_{ij}$. There
is then no distinction between covariant and contravariant roman indices,
and the connection and curvature forms, $\Omega$, $R$, being infinitesimal rotations,
become skew symmetric matrices:

$$\omega_{ij} = -\omega_{ji}, \qquad R_{ij} = -R_{ji}. \tag{2.139}$$
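
The structure equations are easy to exercise on the unit two-sphere. The following is a sketch under stated assumptions: the orthonormal coframe is chosen as $e^{*1} = d\theta$, $e^{*2} = \sin\theta\,d\phi$, the torsion is set to zero, and the connection form is taken skew symmetric as in (2.139).

```latex
% Torsion-free structure equations (2.135) fix the connection one-form:
de^{*1} + \omega^1{}_2 \wedge e^{*2} = 0, \qquad
de^{*2} + \omega^2{}_1 \wedge e^{*1} = 0
\quad\Longrightarrow\quad
\omega^1{}_2 = -\omega^2{}_1 = -\cos\theta\,d\phi .
% Check of the second equation:
%   de^{*2} = \cos\theta\,d\theta\wedge d\phi, \quad
%   \omega^2{}_1\wedge e^{*1} = \cos\theta\,d\phi\wedge d\theta
%                             = -\cos\theta\,d\theta\wedge d\phi .
% Curvature from (2.136); the quadratic term vanishes because
% \omega^1{}_1 = \omega^2{}_2 = 0:
R^1{}_2 = d\omega^1{}_2 = \sin\theta\,d\theta\wedge d\phi = e^{*1}\wedge e^{*2} .
```

The curvature two-form is just the area form, which is the statement that the unit sphere has constant Gaussian curvature equal to one.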
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2.6 Further Exercises and Problems 

Exercise 2.15: Consider the vector fields $X = y\partial_x$, $Y = \partial_y$ in $\mathbb{R}^2$. Find the
flows associated with these fields, and use them to verify the statements made
in section 2.2.1 about the geometric interpretation of the Lie bracket.

Exercise 2.16: Show that the pair of vector fields $L_z = x\partial_y - y\partial_x$ and $L_y =
z\partial_x - x\partial_z$ in $\mathbb{R}^3$ is in involution wherever they are both non-zero. Show further
that the general solution of the system of partial differential equations

$$\begin{aligned}
(x\partial_y - y\partial_x)f &= 0, \\
(x\partial_z - z\partial_x)f &= 0,
\end{aligned}$$

in $\mathbb{R}^3$ is $f(x, y, z) = F(x^2 + y^2 + z^2)$, where $F$ is an arbitrary function.

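Exercise 2.17: In the rolling conditions (2.26) we are using the "Y" convention
for Euler angles. In this convention $\theta$ and $\phi$ are the usual spherical polar
co-ordinate angles with respect to the space-fixed $xyz$ axes. They specify the
direction of the body-fixed $Z$ axis about which we make the final $\psi$ rotation.

Figure 2.7: Euler angles: we first rotate the ball through an angle $\phi$ about
the $z$ axis, thus taking $y \to Y'$, then through $\theta$ about $Y'$, and finally through
$\psi$ about $Z$, so taking $Y' \to Y$.

a) Show that (2.26) are indeed the no-slip rolling conditions

$$\dot x = \omega_y, \qquad \dot y = -\omega_x, \qquad 0 = \omega_z,$$

where $(\omega_x, \omega_y, \omega_z)$ are the components of the ball's angular velocity in
the $xyz$ space-fixed frame.

b) Solve the three constraints in (2.26) so as to obtain the vector fields
$\mathrm{roll}_x$, $\mathrm{roll}_y$ of (2.27).

c) Show that

$$[\mathrm{roll}_x, \mathrm{roll}_y] = -\mathrm{spin}_z,$$

where $\mathrm{spin}_z \equiv \partial_\phi$ corresponds to a rotation about a vertical axis through
the point of contact. This is a new motion, being forbidden by the $\omega_z = 0$
condition.

d) Show that

$$\begin{aligned}
[\mathrm{spin}_z, \mathrm{roll}_x] &= \mathrm{spin}_x, \\
[\mathrm{spin}_z, \mathrm{roll}_y] &= \mathrm{spin}_y,
\end{aligned}$$

where the new vector fields

$$\begin{aligned}
\mathrm{spin}_x &\equiv -(\mathrm{roll}_y - \partial_y), \\
\mathrm{spin}_y &\equiv (\mathrm{roll}_x - \partial_x),
\end{aligned}$$

correspond to rotations of the ball about the space-fixed $x$ and $y$ axes
through its centre, and with the centre of mass held fixed.

We have generated five independent vector fields from the original two.
Therefore, by sufficient rolling to-and-fro, we can position the ball anywhere on the
table, and in any orientation.

Exercise 2.18: The semi-classical dynamics of charge $-e$ electrons in a
magnetic solid are governed by the equations$^8$

$$\begin{aligned}
\dot{\mathbf{k}} &= -\frac{\partial V}{\partial\mathbf{r}} - e\,\dot{\mathbf{r}}\times\mathbf{B}, \\
\dot{\mathbf{r}} &= \frac{\partial\epsilon}{\partial\mathbf{k}} - \dot{\mathbf{k}}\times\mathbf{\Omega}.
\end{aligned}$$

Here $\mathbf{k}$ is the Bloch momentum of the electron, $\mathbf{r}$ is its position, $\epsilon(\mathbf{k})$ its band
energy (in the extended-zone scheme), and $\mathbf{B}(\mathbf{r})$ is the external magnetic field.
The components $\Omega_i$ of the Berry curvature $\mathbf{\Omega}(\mathbf{k})$ are given in terms of the
periodic part $|u(\mathbf{k})\rangle$ of the Bloch wavefunctions of the band by

$$\Omega_i = \frac{i}{2}\epsilon_{ijk}\left(\left\langle\frac{\partial u}{\partial k_j}\bigg|\frac{\partial u}{\partial k_k}\right\rangle - \left\langle\frac{\partial u}{\partial k_k}\bigg|\frac{\partial u}{\partial k_j}\right\rangle\right).$$

The only property of $\mathbf{\Omega}(\mathbf{k})$ needed for the present problem, however, is that
$\mathrm{div}_{\mathbf{k}}\mathbf{\Omega} = 0$.

a) Show that these equations are Hamiltonian, with

$$H(\mathbf{r}, \mathbf{k}) = \epsilon(\mathbf{k}) + V(\mathbf{r})$$

and with

$$\omega = dk_i\,dx_i - \frac{e}{2}\epsilon_{ijk}B_i(\mathbf{r})\,dx_j dx_k + \frac{1}{2}\epsilon_{ijk}\Omega_i(\mathbf{k})\,dk_j dk_k$$

as the symplectic form.$^9$

b) Confirm that the $\omega$ defined in part a) is closed, and that the Poisson
brackets are given by

$$\begin{aligned}
\{x_i, x_j\} &= -\frac{\epsilon_{ijk}\Omega_k}{(1 + e\mathbf{B}\cdot\mathbf{\Omega})}, \\
\{x_i, k_j\} &= -\frac{\delta_{ij} + eB_i\Omega_j}{(1 + e\mathbf{B}\cdot\mathbf{\Omega})}, \\
\{k_i, k_j\} &= \frac{e\,\epsilon_{ijk}B_k}{(1 + e\mathbf{B}\cdot\mathbf{\Omega})}.
\end{aligned}$$

c) Show that the conserved phase-space volume $\omega^3/3!$ is equal to

$$(1 + e\mathbf{B}\cdot\mathbf{\Omega})\,d^3k\,d^3x,$$

instead of the naively expected $d^3k\,d^3x$.

8 M. C. Chang, Q. Niu, Phys. Rev. Lett. 75 (1995) 1348.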

9 C. Duval, Z. Horvath, P. A. Horvathy, L. Martina, P. C. Stichel, Modern Physics
Letters B 20 (2006) 373.

The following pair of exercises show that Cartan's expression for the curvature
tensor remains valid for covariant differentiation in "internal" spaces.
There is, however, no natural concept analogous to the torsion tensor for
internal spaces.

Exercise 2.19: Non-abelian gauge fields as matrix-valued forms. In a
non-abelian Yang-Mills gauge theory, such as QCD, the vector potential

$$A = A_\mu dx^\mu$$

is matrix-valued, meaning that the components $A_\mu$ are matrices which do not
necessarily commute with each other. (These matrices are elements of the Lie
algebra of the gauge group, but we won't need this fact here.) The matrix-valued
curvature, or field-strength, 2-form $F$ is defined by

$$F = dA + A^2 = \frac{1}{2}F_{\mu\nu}\,dx^\mu dx^\nu.$$

Here a combined matrix and wedge product is to be understood:

$$A^2 \equiv A \wedge A = A_\mu A_\nu\,dx^\mu dx^\nu.$$

i) Show that $A^2 = \frac{1}{2}[A_\mu, A_\nu]\,dx^\mu dx^\nu$, and hence show that

$$F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu + [A_\mu, A_\nu].$$

ii) Define the gauge-covariant derivatives

$$\nabla_\mu = \partial_\mu + A_\mu,$$

and show that the commutator $[\nabla_\mu, \nabla_\nu]$ of two of these is equal to $F_{\mu\nu}$.
Show further that if $X$, $Y$ are two vector fields with Lie bracket $[X, Y]$
and $\nabla_X \equiv X^\mu\nabla_\mu$, then

$$F(X, Y) = [\nabla_X, \nabla_Y] - \nabla_{[X,Y]}.$$

iii) Show that $F$ obeys the Bianchi identity

$$dF - FA + AF = 0.$$

Again wedge and matrix products are to be understood. This equation
is the non-abelian version of the source-free Maxwell equation $dF = 0$.

iv) Show that, in any number of dimensions, the Bianchi identity implies
that the 4-form $\mathrm{tr}(F^2)$ is closed, i.e. that $d\,\mathrm{tr}(F^2) = 0$. Similarly show
that the $2n$-form $\mathrm{tr}(F^n)$ is closed. (Here the "tr" means a trace over the
roman matrix indices, and not over the greek space-time indices.)

v) Show that

$$\mathrm{tr}(F^2) = d\left\{\mathrm{tr}\left(A\,dA + \frac{2}{3}A^3\right)\right\}.$$

The 3-form $\mathrm{tr}\left(A\,dA + \frac{2}{3}A^3\right)$ is called a Chern-Simons form.

Exercise 2.20: Gauge transformations. Here we consider how the matrix-valued
vector potential transforms when we make a change of gauge. In other
words, we seek the non-abelian version of $A_\mu \to A_\mu + \partial_\mu\phi$.

i) Let $g$ be an invertible matrix, and $\delta g$ a matrix describing a small change
in $g$. Show that the corresponding change in the inverse matrix is given
by $\delta(g^{-1}) = -g^{-1}(\delta g)g^{-1}$.

ii) Show that under the gauge transformation

$$A \to A^g \equiv g^{-1}Ag + g^{-1}dg,$$

we have $F \to g^{-1}Fg$. (Hint: The labour is minimized by exploiting the
covariant derivative identity in part ii) of the previous exercise.)

iii) Deduce that $\mathrm{tr}(F^n)$ is gauge invariant.

iv) Show that a necessary condition for the matrix-valued gauge field $A$ to
be "pure gauge", i.e. for there to be a position dependent matrix $g$ such
that $A = g^{-1}dg$, is that $F = 0$, where $F$ is the curvature two-form of the
previous exercise.

In a gauge theory based on a Lie group $G$, the matrices $g$ will be elements of
the group, or, more generally, they will form a matrix representation of the
group.



Chapter 3 

Integration on Manifolds 



One usually thinks of integration as requiring measure - a notion of volume, 
and hence of size and length, and so a metric. A metric however is not 
required for integrating differential forms. They come pre-equipped with 
whatever notion of length, area, or volume is required. 

3.1 Basic Notions 
3.1.1 Line Integrals 

Consider, for example, the form $df$. We want to try to give a meaning to the
symbol

$$I_1 = \int_\Gamma df. \tag{3.1}$$

Here $\Gamma$ is a path in our space starting at some point $P_0$ and ending at the point
$P_1$. Any reasonable definition of $I_1$ should end up with the answer we would
immediately write down if we saw an expression like $I_1$ in an elementary
calculus class. This answer is

$$I_1 = \int_\Gamma df = f(P_1) - f(P_0). \tag{3.2}$$

No notion of a metric is needed here. There is however a geometric picture of
what we have done. We draw in our space the surfaces $\ldots, f(x) = -1, f(x) =
0, f(x) = 1, \ldots$, and perhaps fill in intermediate values if necessary. We
then start at $P_0$ and travel from there to $P_1$, keeping track of how many of
these surfaces we pass through (with sign $-1$, if we pass back through them).
The integral of $df$ is this number. Figure 3.1 illustrates a case in which
$\int_\Gamma df = 5.5 - 1.5 = 4$.




Figure 3.1: The integral of a one-form. (The level surfaces are labelled $f = 1, 2, \ldots, 6$.)



What we have defined is a signed integral. If we parameterise the path as
$x(s)$, $0 \le s \le 1$, with $x(0) = P_0$ and $x(1) = P_1$, we have

$$I_1 = \int_0^1\left(\frac{df}{ds}\right)ds, \tag{3.3}$$

where the right hand side is an ordinary one-variable integral. It is important
that we did not write $\left|\frac{df}{ds}\right|$ in this integral. The absence of the modulus sign
ensures that if we partially retrace our route, so that we pass over some part
of $\Gamma$ three times (twice forward and once back), we obtain the same answer
as if we went only forward.
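
The role of the missing modulus sign can be illustrated numerically. The sketch below is only an example: the function $f$, the path, and the helper names are all invented for the purpose. The path runs from $P_0 = (0, 0)$ to $P_1 = (1, 1)$ but doubles back on itself along the way, and the integral (3.3) is approximated by a midpoint rule.

```python
import math

def f(x, y):
    """An arbitrary smooth function whose differential df we integrate."""
    return x * x + 3.0 * y

def path(s):
    """A path from (0, 0) to (1, 1) along y = x^2 that partly retraces itself."""
    t = s + 0.3 * math.sin(3.0 * math.pi * s)   # t(0) = 0, t(1) = 1, not monotonic
    return (t, t * t)

def line_integral_of_df(f, path, n=20000, signed=True):
    """Midpoint-rule approximation to the integral of df/ds (or |df/ds|) over [0, 1]."""
    ds = 1.0 / n
    h = 1e-6                                     # step for the numerical derivative
    total = 0.0
    for k in range(n):
        s = (k + 0.5) * ds
        dfds = (f(*path(s + h)) - f(*path(s - h))) / (2.0 * h)
        total += (dfds if signed else abs(dfds)) * ds
    return total

if __name__ == "__main__":
    print("signed integral  :", line_integral_of_df(f, path))
    print("f(P1) - f(P0)    :", f(*path(1.0)) - f(*path(0.0)))
    print("with modulus sign:", line_integral_of_df(f, path, signed=False))
```

The signed integral reproduces $f(P_1) - f(P_0)$ even though parts of the route are traversed backwards, whereas the version with the modulus sign counts the retraced stretches repeatedly and comes out larger.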



3.1.2 Skew-symmetry and Orientations 

What about integrating 2 and 3-forms? Why the skew-symmetry? To answer
these questions, think about assigning some sort of "area" in $\mathbb{R}^2$ to the
parallelogram defined by the two vectors $\mathbf{x}$, $\mathbf{y}$. This is going to be some function
of the two vectors. Let us call it $\omega(\mathbf{x}, \mathbf{y})$. What properties do we demand of
this function? There are at least three:

i) Scaling: If we double the length of one of the vectors, we expect the
area to double. Generalizing this, we demand $\omega(\lambda\mathbf{x}, \mu\mathbf{y}) = (\lambda\mu)\omega(\mathbf{x}, \mathbf{y})$.
(Note that we are not putting modulus signs on the lengths, so we are
allowing negative "areas", and for the sign to change when we reverse
the direction of a vector.)

ii) Additivity: The drawing in figure 3.2 shows that we ought to have

$$\omega(\mathbf{x}_1 + \mathbf{x}_2, \mathbf{y}) = \omega(\mathbf{x}_1, \mathbf{y}) + \omega(\mathbf{x}_2, \mathbf{y}), \tag{3.4}$$

and similarly for the second slot.

Figure 3.2: Additivity of $\omega(\mathbf{x}, \mathbf{y})$.

iii) Degeneration: If the two sides coincide, the area should be zero. Thus
$\omega(\mathbf{x}, \mathbf{x}) = 0$.

The first two properties show that $\omega$ should be a multilinear form. The
third shows that it must be skew-symmetric!

$$\begin{aligned}
0 = \omega(\mathbf{x} + \mathbf{y}, \mathbf{x} + \mathbf{y}) &= \omega(\mathbf{x}, \mathbf{x}) + \omega(\mathbf{x}, \mathbf{y}) + \omega(\mathbf{y}, \mathbf{x}) + \omega(\mathbf{y}, \mathbf{y}) \\
&= \omega(\mathbf{x}, \mathbf{y}) + \omega(\mathbf{y}, \mathbf{x}).
\end{aligned} \tag{3.5}$$

So

$$\omega(\mathbf{x}, \mathbf{y}) = -\omega(\mathbf{y}, \mathbf{x}). \tag{3.6}$$

These are exactly the properties possessed by a 2-form. Similarly, a 3-form
outputs a volume element.
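
The three demands can be tried out on a concrete candidate. As an assumption for illustration, take the familiar signed area $\omega(\mathbf{x}, \mathbf{y}) = x^1y^2 - x^2y^1$ in $\mathbb{R}^2$; the sketch below checks scaling, additivity, degeneration and the resulting skew-symmetry on sample vectors. The helper name `check_properties` is hypothetical.

```python
def omega(x, y):
    """Signed area of the parallelogram spanned by x and y in R^2."""
    return x[0] * y[1] - x[1] * y[0]

def check_properties(x, x2, y, lam=3.0, mu=-2.0):
    """Evaluate the scaling, additivity, degeneration and skew properties."""
    scaled_x = (lam * x[0], lam * x[1])
    scaled_y = (mu * y[0], mu * y[1])
    summed = (x[0] + x2[0], x[1] + x2[1])
    return {
        "scaling": omega(scaled_x, scaled_y) == (lam * mu) * omega(x, y),
        "additivity": omega(summed, y) == omega(x, y) + omega(x2, y),
        "degeneration": omega(x, x) == 0.0,
        "skew": omega(x, y) == -omega(y, x),
    }

if __name__ == "__main__":
    # Sample vectors chosen so that all the arithmetic is exact in floating point.
    print(check_properties((2.0, 1.0), (-1.0, 3.0), (0.5, 4.0)))
```

With a negative scale factor $\mu$ the area changes sign, which is the behaviour that the absence of modulus signs in the scaling rule is meant to allow.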

These volume elements are oriented. Remember that an orientation of a 
set of vectors is a choice of order in which to write them. If we interchange 
two vectors, the orientation changes sign. We do not distinguish orientations 
related by an even number of interchanges. A p-form assigns a signed (±) 
p-dimensional volume element to an orientated set of vectors. If we change 
the orientation, we change the sign of the volume element. 

Orientable and Non-orientable Manifolds 

In the classic video game Asteroids you could select periodic boundary
conditions so that your spaceship would leave the right-hand side of the screen
and re-appear on the left. The game universe was topologically a torus $T^2$.
Suppose that we modify the game code so that each bit of the spaceship
re-appears at the point diametrically opposite the point it left. This does not
seem like a drastic change until you play a game with a left-hand-drive (US)
spaceship. If you send the ship off the screen and watch as it re-appears on the
opposite side, you will observe the ship transmogrify into a right-hand-drive
(British) craft. If we ourselves made such an excursion, we would end up
starving to death because all our left-handed digestive enzymes would have
been converted to right-handed ones. The manifold we have constructed is
topologically equivalent to the real projective plane $\mathbb{R}P^2$. The lack of a global
notion of being left or right-handed makes it an example of a non-orientable
manifold.

Figure 3.3: A spaceship leaves one side of the screen and returns on the other
with a) torus boundary conditions, b) projective-plane boundary conditions.
Observe how, in case b), the spaceship has changed from being left handed
to being right-handed.

A manifold or surface is orientable if we can choose a global orientation
for the tangent bundle. The simplest way to do this would be to find a
smoothly varying set of basis-vector fields, $e_\mu(x)$, on the surface and define
the orientation by choosing an order, $e_1(x), e_2(x), \ldots, e_d(x)$, in which to write
them. In general, however, a globally-defined smooth basis will not exist
(try to construct one for the two-sphere, $S^2$!). We will, however, be able to
find a continuously varying orientated basis $e_1^{(i)}(x), e_2^{(i)}(x), \ldots, e_d^{(i)}(x)$ for each
member, labelled by $(i)$, of an atlas of coordinate charts. We should choose
the charts so the intersection of any pair forms a connected set. Assuming
that this has been done, the orientation of a pair of overlapping charts is said
to coincide if the determinant, $\det A$, of the map $e_\mu^{(i)} = A_\mu^\nu e_\nu^{(j)}$ relating the
bases in the region of overlap, is positive.$^1$ If bases can be chosen so that all
overlap determinants are positive, the manifold is orientable and the selected
bases define the orientation. If bases cannot be so chosen, the manifold or
surface is non-orientable.

Exercise 3.1: Consider a three-dimensional ball B 3 with diametrically oppo- 
site points of its surface identified. What would happen to an aircraft flying 
through the surface of the ball? Would it change handedness, turn inside out, 
or simply turn upside down? Is this ball an orientable 3-manifold? 



3.2 Integrating p- Forms 

A $p$-form is naturally integrated over an oriented $p$-dimensional surface or
manifold. Rather than start with an abstract definition, we will first explain
this pictorially, and then translate the pictures into mathematics.



3.2.1 Counting Boxes 

To visualize integrating 2-forms let us try to make sense of 



where f2 is an oriented region embedded in three dimensions. The surfaces 
/ = const, and g = const, break the space up into a series of tubes. The 
oriented surface Q cuts these tubes in a two-dimensional mesh of (oriented) 
parallelograms. 



1 The determinant will have the same sign in the entire overlap region. If it did not, 
continuity and connectedness would force it to be zero somewhere, implying that one of 
the putative bases was not linearly independent there 




(3.7) 
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Figure 3.4: The integration region cuts the tubes into parallelograms. 

We define an integral by counting how many parallelograms (including frac- 
tions of a parallelogram) there are, taking the number to be positive if the 
parallelogram given by the mesh is oriented in the same way as the surface, 
and negative otherwise. To compute 



we do the same, but weight each parallelogram, by the value of h at that 
point. The integral J n fdxdy, over a region in M. 2 thus ends up being the 
number we would compute in a multivariate calculus class, but the integral 
j n fdydx, would be minus this. Similarly we compute 



of the 3-form dfdgdh over the oriented volume H, by counting how many 
boxes defined by the surfaces f,g,h= constant, are included in S. 

An equivalent way of thinking of the integral of a p-form uses its definition 
as a skew-symmetric p-linear function. Accordingly we evaluate 




(3.8) 




(3.9) 




(3.10) 



where w is a 2-form, and f2 is an oriented 2-surface, by plugging vectors 
into it). We tile the surface Q with collection of (small) parallelograms, each 
defined by an oriented pair of basis vectors vi and V2. 
Figure 3.5: We tile $\Omega$ with small oriented parallelograms and compute
$\sum_{x\in\Omega} \omega(\mathbf{v}_1(x), \mathbf{v}_2(x))$.

At each base point $x$ we insert these vectors into the 2-form (in the order
specified by their orientation) to get $\omega(\mathbf{v}_1, \mathbf{v}_2)$,
and then sum the resulting numbers to get $I_2$. Similarly, we integrate a
p-form over an oriented p-dimensional region by decomposing the region into
infinitesimal p-dimensional oriented parallelepipeds, inserting their
defining vectors into the form, and summing their contributions.

3.2.2 Relation to conventional integrals

The previous section explained how to think pictorially about the integral.
Here we interpret the pictures as multi-variable calculus.

We begin by motivating our recipe by considering a change of variables
in an integral in $\mathbb{R}^2$. Suppose we set $x^1 = x^1(y^1, y^2)$,
$x^2 = x^2(y^1, y^2)$ in

\[
  I = \int_\Omega f(x)\, dx^1 dx^2   \tag{3.11}
\]

and use

\[
  \begin{aligned}
    dx^1 &= \frac{\partial x^1}{\partial y^1}\, dy^1 + \frac{\partial x^1}{\partial y^2}\, dy^2 , \\
    dx^2 &= \frac{\partial x^2}{\partial y^1}\, dy^1 + \frac{\partial x^2}{\partial y^2}\, dy^2 .
  \end{aligned}
  \tag{3.12}
\]

Since $dy^1 dy^2 = -dy^2 dy^1$, we have

\[
  dx^1 dx^2 = \left( \frac{\partial x^1}{\partial y^1}\frac{\partial x^2}{\partial y^2}
                   - \frac{\partial x^2}{\partial y^1}\frac{\partial x^1}{\partial y^2} \right) dy^1 dy^2 .
  \tag{3.13}
\]
Thus

\[
  \int_\Omega f(x)\, dx^1 dx^2
  = \int_{\Omega'} f(x(y))\, \frac{\partial(x^1,x^2)}{\partial(y^1,y^2)}\, dy^1 dy^2 ,
  \tag{3.14}
\]

where $\partial(x^1,x^2)/\partial(y^1,y^2)$ is the Jacobian determinant

\[
  \frac{\partial(x^1,x^2)}{\partial(y^1,y^2)}
  = \left( \frac{\partial x^1}{\partial y^1}\frac{\partial x^2}{\partial y^2}
         - \frac{\partial x^2}{\partial y^1}\frac{\partial x^1}{\partial y^2} \right),
  \tag{3.15}
\]

and $\Omega'$ is the integration region in the new variables. There is
therefore no need to include an explicit Jacobian factor when changing
variables in an integral of a p-form over a p-dimensional space: it comes
for free with the form.



This observation leads us to the general prescription: To evaluate the
integral of a p-form

\[
  \omega = \frac{1}{p!}\, \omega_{\mu_1 \ldots \mu_p}\, dx^{\mu_1} \cdots dx^{\mu_p}   \tag{3.16}
\]

over the region $\Omega$ of a p-dimensional surface in a $d \ge p$
dimensional space, substitute a parameterization

\[
  \begin{aligned}
    x^1 &= x^1(\xi^1, \xi^2, \ldots, \xi^p), \\
        &\;\;\vdots \\
    x^d &= x^d(\xi^1, \xi^2, \ldots, \xi^p),
  \end{aligned}
  \tag{3.17}
\]

of the surface into $\omega$. Next, use

\[
  dx^\mu = \frac{\partial x^\mu}{\partial \xi^i}\, d\xi^i ,   \tag{3.18}
\]

so that

\[
  \omega \to \omega(x(\xi))_{i_1 i_2 \ldots i_p}\,
  \frac{\partial x^{i_1}}{\partial \xi^1} \cdots \frac{\partial x^{i_p}}{\partial \xi^p}\,
  d\xi^1 \cdots d\xi^p ,   \tag{3.19}
\]

which we regard as a p-form on $\Omega$. (Our customary $1/p!$ is absent
here because we have chosen a particular order for the $d\xi$'s.) Then

\[
  \int_\Omega \omega \equiv \int \omega(x(\xi))_{i_1 i_2 \ldots i_p}\,
  \frac{\partial x^{i_1}}{\partial \xi^1} \cdots \frac{\partial x^{i_p}}{\partial \xi^p}\,
  d\xi^1 \cdots d\xi^p ,   \tag{3.20}
\]

where the right hand side is an ordinary multiple integral. This recipe is a
generalization of the formula (3.3) which reduced the integral of a one-form
to an ordinary single-variable integral. Because the appropriate Jacobian
factor appears automatically, the numerical value of the integral does not
depend on the choice of parameterization of the surface.
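Example: To integrate the 2-form $x\,dy\,dz$ over the surface of a
two-dimensional sphere of radius $R$, we parameterize the surface with
polar angles as

\[
  \begin{aligned}
    x &= R \sin\phi \sin\theta , \\
    y &= R \cos\phi \sin\theta , \\
    z &= R \cos\theta .
  \end{aligned}
  \tag{3.21}
\]

Then

\[
  \begin{aligned}
    dy &= -R \sin\phi \sin\theta \, d\phi + R \cos\phi \cos\theta \, d\theta , \\
    dz &= -R \sin\theta \, d\theta ,
  \end{aligned}
  \tag{3.22}
\]

and so

\[
  x\, dy\, dz = R^3 \sin^2\phi \, \sin^3\theta \; d\phi\, d\theta .   \tag{3.23}
\]

We therefore evaluate

\[
  \begin{aligned}
    \int_{\rm sphere} x\, dy\, dz
      &= R^3 \int_0^{2\pi}\!\!\int_0^{\pi} \sin^2\phi \, \sin^3\theta \; d\phi\, d\theta \\
      &= R^3 \int_0^{2\pi} \sin^2\phi \, d\phi \int_0^{\pi} \sin^3\theta \, d\theta \\
      &= R^3 \pi \int_{-1}^{1} (1 - \cos^2\theta)\, d\cos\theta \\
      &= \frac{4}{3}\pi R^3 .
  \end{aligned}
  \tag{3.24}
\]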
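The recipe of (3.17)-(3.20) can be checked numerically on this example. The following is a minimal sketch, assuming NumPy is available; the function name `integrate_x_dydz_on_sphere` and the midpoint-rule discretization are illustrative choices, not part of the original notes. It substitutes the parameterization (3.21), forms the Jacobian of $(y, z)$ with respect to $(\phi, \theta)$ that arises from $dy\,dz$, and sums over the parameter rectangle.

```python
import numpy as np

def integrate_x_dydz_on_sphere(R=1.0, n=400):
    """Midpoint-rule estimate of the integral of x dy dz over a sphere of radius R."""
    # Midpoints of the parameter cells: phi in [0, 2*pi], theta in [0, pi].
    phi = (np.arange(n) + 0.5) * (2 * np.pi / n)
    theta = (np.arange(n) + 0.5) * (np.pi / n)
    P, T = np.meshgrid(phi, theta, indexing="ij")

    # Parameterization (3.21) and the partial derivatives of y and z.
    x = R * np.sin(P) * np.sin(T)
    dy_dphi = -R * np.sin(P) * np.sin(T)
    dy_dtheta = R * np.cos(P) * np.cos(T)
    dz_dphi = np.zeros_like(P)
    dz_dtheta = -R * np.sin(T)

    # dy dz = [d(y, z)/d(phi, theta)] dphi dtheta: the Jacobian comes from the form.
    jac = dy_dphi * dz_dtheta - dy_dtheta * dz_dphi
    cell = (2 * np.pi / n) * (np.pi / n)
    return np.sum(x * jac) * cell
```

Comparing `integrate_x_dydz_on_sphere(R)` with `4 * np.pi * R**3 / 3` for a few radii gives a quick consistency check of (3.24).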



The volume form 



Although we do not need any notion of length to integrate a differential
form, a p-dimensional surface embedded or immersed in $\mathbb{R}^d$ does
inherit a distance scale from the $\mathbb{R}^d$ Euclidean metric, and this
is used to define the area or volume of the surface. When the Cartesian
co-ordinates $x^1, \ldots, x^d$ of a point in the surface are given as
$x^a(\xi^1, \ldots, \xi^p)$, where the $\xi^1, \ldots, \xi^p$ are
co-ordinates on the surface, then the inherited, or induced, metric is

\[
  ``ds^2" \equiv g(\ ,\ ) \equiv g_{\mu\nu}\, d\xi^\mu \otimes d\xi^\nu ,   \tag{3.25}
\]

where

\[
  g_{\mu\nu} = \sum_{a=1}^{d} \frac{\partial x^a}{\partial \xi^\mu} \frac{\partial x^a}{\partial \xi^\nu} .   \tag{3.26}
\]

The volume form associated with the induced metric is

\[
  d({\rm Volume}) = \sqrt{g}\; d\xi^1 \cdots d\xi^p ,   \tag{3.27}
\]

where $g = \det(g_{\mu\nu})$. The integral of this p-form over the surface
gives the area, or p-dimensional volume, of the surface.

If we change the parameterization of the surface from $\xi^\mu$ to
$\zeta^\mu$, neither the $d\xi^1 \cdots d\xi^p$ nor the $\sqrt{g}$ are
separately invariant, but the Jacobian arising from the change of the
p-form, $d\xi^1 \cdots d\xi^p \to d\zeta^1 \cdots d\zeta^p$, cancels against
the factor coming from the transformation law of the metric tensor,
$g_{\mu\nu} \to g'_{\mu\nu}$, leading to

\[
  \sqrt{g}\; d\xi^1 \cdots d\xi^p = \sqrt{g'}\; d\zeta^1 \cdots d\zeta^p .   \tag{3.28}
\]

The volume of the surface is therefore independent of the co-ordinate system
used to evaluate it.

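Example: The induced metric on the surface of a unit-radius two-sphere
embedded in $\mathbb{R}^3$ is, expressed in polar angles,

\[
  ``ds^2" = g(\ ,\ ) = d\theta \otimes d\theta + \sin^2\theta \; d\phi \otimes d\phi .
\]

Thus

\[
  g = \begin{vmatrix} 1 & 0 \\ 0 & \sin^2\theta \end{vmatrix} = \sin^2\theta ,
\]

and

\[
  d({\rm Area}) = \sin\theta \; d\theta\, d\phi .
\]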
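The construction (3.25)-(3.27) is easy to mimic numerically. The sketch below assumes NumPy; the helper names `induced_metric`, `unit_sphere` and `surface_area`, and the use of central differences for the partial derivatives, are illustrative assumptions rather than anything prescribed by the text. It builds $g_{\mu\nu}$ from an embedding and integrates $\sqrt{g}$ over a rectangle of surface co-ordinates.

```python
import numpy as np

def induced_metric(embedding, xi, eps=1e-6):
    """Induced metric g_{mu nu} = sum_a (dx^a/dxi^mu)(dx^a/dxi^nu) at the point xi.

    `embedding` maps p surface co-ordinates to d Cartesian co-ordinates;
    the partial derivatives are estimated by central differences."""
    xi = np.asarray(xi, dtype=float)
    p = xi.size
    cols = []
    for mu in range(p):
        step = np.zeros(p)
        step[mu] = eps
        cols.append((embedding(xi + step) - embedding(xi - step)) / (2 * eps))
    J = np.stack(cols, axis=1)      # shape (d, p): J[a, mu] = dx^a / dxi^mu
    return J.T @ J

def unit_sphere(xi):
    """Embedding of the unit two-sphere with xi = (theta, phi)."""
    theta, phi = xi
    return np.array([np.sin(theta) * np.cos(phi),
                     np.sin(theta) * np.sin(phi),
                     np.cos(theta)])

def surface_area(embedding, lo, hi, n=60):
    """Midpoint-rule integral of sqrt(det g) over a rectangular parameter range (p = 2)."""
    u = lo[0] + (np.arange(n) + 0.5) * (hi[0] - lo[0]) / n
    v = lo[1] + (np.arange(n) + 0.5) * (hi[1] - lo[1]) / n
    cell = (hi[0] - lo[0]) * (hi[1] - lo[1]) / n**2
    total = 0.0
    for a in u:
        for b in v:
            total += np.sqrt(np.linalg.det(induced_metric(embedding, (a, b))))
    return total * cell
```

For the unit sphere, `induced_metric(unit_sphere, (theta, phi))` should be close to the diagonal matrix with entries $1$ and $\sin^2\theta$, and `surface_area(unit_sphere, (0, 0), (np.pi, 2 * np.pi))` should be close to $4\pi$.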



3.3 Stokes' Theorem 

All the integral theorems of classical vector calculus are special cases of
Stokes' Theorem: If $\partial\Omega$ denotes the (oriented) boundary of the
(oriented) region $\Omega$, then

\[
  \int_\Omega d\omega = \int_{\partial\Omega} \omega .
\]

We will not provide a detailed proof. Apart from notation, it would
parallel the proof of Stokes' or Green's theorems in ordinary vector calculus:
The exterior derivative $d$ has been defined so that the theorem holds for
an infinitesimal square, cube, or hypercube. We therefore divide $\Omega$
into many such small regions. We then observe that the contributions of the
interior boundary faces cancel because all interior faces are shared between
two adjacent regions, and so occur twice with opposite orientations. Only
the contribution of the outer boundary remains.
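Example: If $\Omega$ is a region of $\mathbb{R}^2$, then from

\[
  d\left[ \tfrac{1}{2} (x\, dy - y\, dx) \right] = dx\, dy ,
\]

we have

\[
  {\rm Area}(\Omega) = \int_\Omega dx\, dy = \frac{1}{2} \int_{\partial\Omega} (x\, dy - y\, dx) .
\]

Example: Again, if $\Omega$ is a region of $\mathbb{R}^2$, then from
$d[r^2\, d\theta/2] = r\, dr\, d\theta$ we have

\[
  {\rm Area}(\Omega) = \int_\Omega r\, dr\, d\theta = \frac{1}{2} \int_{\partial\Omega} r^2\, d\theta .
\]

Example: If $\Omega$ is the interior of a sphere of radius $R$, then

\[
  \int_\Omega dx\, dy\, dz = \int_{\partial\Omega} x\, dy\, dz = \frac{4}{3}\pi R^3 .
\]

Here we have referred back to (3.24) to evaluate the surface integral.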
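The first of these examples is the continuum version of the "shoelace" formula for the area of a polygon. As a small illustration (a sketch assuming NumPy, with the hypothetical function name `polygon_area`), the boundary integral $\tfrac12\oint (x\,dy - y\,dx)$ can be evaluated edge by edge:

```python
import numpy as np

def polygon_area(vertices):
    """Signed area (1/2) * closed-loop integral of (x dy - y dx) around a polygon.

    On the straight edge from (x0, y0) to (x1, y1) the line integral of
    x dy - y dx is exactly x0*y1 - x1*y0, so the sum below is exact."""
    v = np.asarray(vertices, dtype=float)
    x, y = v[:, 0], v[:, 1]
    x_next, y_next = np.roll(x, -1), np.roll(y, -1)
    return 0.5 * np.sum(x * y_next - x_next * y)
```

The result is positive when the vertices are listed anticlockwise and negative when they are listed clockwise, which reflects the role of the orientation of $\partial\Omega$ in Stokes' theorem.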
Example: Archimedes' tombstone. 

Archimedes of Syracuse gave instructions that his tombstone should have
displayed on it a diagram consisting of a sphere and circumscribed cylinder.
Cicero, while serving as quaestor in Sicily, had the stone restored.² This
has been said to be the only significant contribution by a Roman to pure
mathematics. The carving on the stone was to commemorate Archimedes'
results about the areas and volumes of spheres, including the one illustrated
in figure 3.6, that the area of the spherical cap cut off by slicing through
the cylinder is equal to the area cut off on the cylinder.

Figure 3.6: Sphere and circumscribed cylinder.

We can understand this result via Stokes' theorem: If the two-sphere $S^2$
is parameterized by spherical polar co-ordinates $\theta$, $\phi$, and
$\Omega$ is a region on the sphere, then

\[
  {\rm Area}(\Omega) = \int_\Omega \sin\theta \, d\theta\, d\phi = \int_{\partial\Omega} (1 - \cos\theta)\, d\phi ,
\]

and applying this to the figure, where the cap is defined by
$\theta < \theta_0$, gives

\[
  {\rm Area(cap)} = 2\pi (1 - \cos\theta_0) ,
\]

which is indeed the area of the blue cylinder.

² Marcus Tullius Cicero, Tusculan Disputations, Book V, Sections 64-66.



Exercise 3.2: The sphere $S^n$ can be thought of as the locus of points in
$\mathbb{R}^{n+1}$ obeying $\sum_{i=1}^{n+1} (x^i)^2 = 1$. Use its invariance
under orthogonal transformations to show that the element of surface
"volume" of the n-sphere can be written as

\[
  d({\rm Volume\ on\ } S^n) = \frac{1}{n!}\, \epsilon_{\alpha_1 \alpha_2 \ldots \alpha_{n+1}}\,
  x^{\alpha_1}\, dx^{\alpha_2} \cdots dx^{\alpha_{n+1}} .
\]

Use Stokes' theorem to relate the integral of this form over the surface of the
sphere to the volume of the solid unit sphere. Confirm that we get the correct
proportionality between the volume of the solid unit sphere and the volume
or area of its surface.
3.4 Applications

We now know how to integrate forms. What sort of forms should we seek
to integrate? For a physicist working with a classical or quantum field, a
plentiful supply of interesting forms is obtained by using the field to pull
back geometric objects.

3.4.1 Pull-backs and Push-forwards 

If we have a map $\phi$ from a manifold $M$ to another manifold $N$, and we
choose a point $x \in M$, we can push forward a vector from $TM_x$ to
$TN_{\phi(x)}$ in the obvious way (map head-to-head and tail-to-tail). This
map is denoted by $\phi_* : TM_x \to TN_{\phi(x)}$.

Figure 3.7: Pushing forward a vector $X$ from $TM_x$ to $TN_{\phi(x)}$.

If the vector $X$ has components $X^\mu$ and the map takes the point with
coordinates $x^\mu$ to one with coordinates $\xi^\mu(x)$, the vector
$\phi_* X$ has components

\[
  (\phi_* X)^\mu = \frac{\partial \xi^\mu}{\partial x^\nu}\, X^\nu .   \tag{3.29}
\]

This looks very like the transformation formula for contravariant vector
components under a change of coordinate system. What we are doing here is
conceptually different, however. A change of co-ordinates produces a passive
transformation, i.e. a new description for an unchanging vector. A push
forward is an active transformation: we are changing a vector into a
different one. Furthermore, the map from $M \to N$ is not being assumed to be
one-to-one, so, contrary to the requirement imposed on a co-ordinate
transformation, it may not be possible to invert the functions $\xi^\mu(x)$
and write the $x^\nu$'s as functions of the $\xi^\mu$'s.

While we can push forward individual vectors, we cannot always push
forward a vector field $X$ from $TM$ to $TN$. If two distinct points $x_1$
and $x_2$ chanced to map to the same point $\xi \in N$, and
$X(x_1) \ne X(x_2)$, we would not know whether to choose $\phi_*[X(x_1)]$ or
$\phi_*[X(x_2)]$ as $[\phi_* X](\xi)$. This problem does not occur for
differential forms. A map $\phi : M \to N$ induces a natural, and always
well defined, pull-back map
$\phi^* : \bigwedge^p(T^*N) \to \bigwedge^p(T^*M)$ which works as follows:
Given a form $\omega \in \bigwedge^p(T^*N)$, we define $\phi^*\omega$ as a
form on $M$ by specifying what we get when we plug the vectors
$X_1, X_2, \ldots, X_p \in TM$ into it. We evaluate the form at $x \in M$ by
pushing the vectors forward from $TM_x$ to $TN_{\phi(x)}$, plugging them into
$\omega$ at $\phi(x)$, and declaring the result to be the evaluation of
$\phi^*\omega$ on the $X_i$ at $x$. Symbolically

\[
  [\phi^*\omega](X_1, X_2, \ldots, X_p) = \omega(\phi_* X_1, \phi_* X_2, \ldots, \phi_* X_p) .   \tag{3.30}
\]

This may seem rather abstract, but the idea is in practice quite simple:
If the map takes $x \in M \to \xi(x) \in N$, and

\[
  \omega = \frac{1}{p!}\, \omega_{i_1 \ldots i_p}(\xi)\, d\xi^{i_1} \cdots d\xi^{i_p} ,   \tag{3.31}
\]

then

\[
  \begin{aligned}
    \phi^*\omega &= \frac{1}{p!}\, \omega_{i_1 i_2 \ldots i_p}[\xi(x)]\,
                    d\xi^{i_1}(x)\, d\xi^{i_2}(x) \cdots d\xi^{i_p}(x) \\
                 &= \frac{1}{p!}\, \omega_{i_1 i_2 \ldots i_p}[\xi(x)]\,
                    \frac{\partial \xi^{i_1}}{\partial x^{\mu_1}}
                    \frac{\partial \xi^{i_2}}{\partial x^{\mu_2}} \cdots
                    \frac{\partial \xi^{i_p}}{\partial x^{\mu_p}}\,
                    dx^{\mu_1} \cdots dx^{\mu_p} .
  \end{aligned}
  \tag{3.32}
\]

Computationally, the process of pulling back a form is so transparent that
it is easy to confuse it with a simple change of variable. That it is not the
same operation will become clear in the next few sections where we consider
maps that are many-to-one.
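For a 2-form, the definition (3.30) and the component formula (3.32) both reduce to a small matrix computation at each point. The sketch below assumes NumPy; the function names and the choice of the two-to-one squaring map $z \mapsto z^2$ on the plane are illustrative assumptions. It pulls back the area form $d\xi\, d\eta$ and can be used to confirm that plugging pushed-forward vectors into $\omega$ agrees with the pulled-back components.

```python
import numpy as np

def pull_back_two_form(omega, jacobian):
    """Components of the pull-back of a 2-form at one point.

    omega[i, j]    : antisymmetric components of the form at xi(x) in N
    jacobian[i, m] : d xi^i / d x^m, the matrix that pushes vectors forward
    Returns (phi^* omega)[m, n] = omega[i, j] J[i, m] J[j, n]."""
    return jacobian.T @ omega @ jacobian

def squaring_map_jacobian(x, y):
    """Jacobian of the two-to-one map (x, y) -> (x^2 - y^2, 2xy), i.e. z -> z^2."""
    return np.array([[2 * x, -2 * y],
                     [2 * y,  2 * x]])

# Area form d(xi) d(eta) on the target plane: omega_{12} = -omega_{21} = 1.
area_form = np.array([[0.0, 1.0], [-1.0, 0.0]])
```

At the point $(x, y) = (1, 2)$ the pulled-back form is $4(x^2 + y^2)\, dx\, dy = 20\, dx\, dy$. The map is not invertible, yet the pull-back is perfectly well defined, which is the point made in the text.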

Exercise 3.3: Show that the operation of taking an exterior derivative
commutes with a pull back:

\[
  d[\phi^*\omega] = \phi^*(d\omega) .
\]

Exercise 3.4: If the map $\phi : M \to N$ is invertible then we may push
forward a vector field $X$ on $M$ to get a vector field $\phi_* X$ on $N$.
Show that

Exercise 3.5: Again assume that $\phi : M \to N$ is invertible. By using the
co-ordinate expressions for the Lie bracket and the effect of a push-forward,
show that if $X$, $Y$ are vector fields on $TM$ then

\[
  \phi_*([X, Y]) = [\phi_* X, \phi_* Y] ,
\]

as vector fields on $TN$.

3.4.2 Spin textures 

As an application of pull-backs we will consider some of the topological
aspects of spin textures, which are fields of unit vectors $\mathbf{n}$, or
"spins", in two or three dimensions.

Consider a smooth map $\mathbf{n} : \mathbb{R}^2 \to S^2$ that assigns
$x \mapsto \mathbf{n}(x)$, where $\mathbf{n}$ is a three-dimensional unit
vector whose tip defines a point on the 2-sphere $S^2$. A physical example
of such an $\mathbf{n}(x)$ would be the local direction of the spin
polarization in a ferromagnetically-coupled two-dimensional electron gas.

In terms of $\mathbf{n}$, the area 2-form on the sphere becomes

\[
  \Omega = \frac{1}{2}\, \mathbf{n} \cdot (d\mathbf{n} \times d\mathbf{n})
         = \frac{1}{2}\, \epsilon_{ijk}\, n^i\, dn^j\, dn^k .   \tag{3.33}
\]

The $\mathbf{n}$ map pulls this area-form back to

\[
  F \equiv \mathbf{n}^*\Omega
    = \frac{1}{2} \left( \epsilon_{ijk}\, n^i\, \partial_\mu n^j\, \partial_\nu n^k \right) dx^\mu dx^\nu
    = \left( \epsilon_{ijk}\, n^i\, \partial_1 n^j\, \partial_2 n^k \right) dx^1 dx^2 ,   \tag{3.34}
\]

which is a differential form in $\mathbb{R}^2$. We will call it the
topological charge density. It measures the area on the two-sphere swept out
by the $\mathbf{n}$ vectors as we explore a square in $\mathbb{R}^2$ of side
$dx^1$ by $dx^2$.

Suppose now that the vector $\mathbf{n}$ tends to some fixed direction at
large distance. This allows us to think of "infinity" as a single point, and
the assignment $x \mapsto \mathbf{n}(x)$ as a map from $S^2$ to $S^2$. Such
maps are characterized topologically by their "topological charge," or
winding number $N$, which counts the number of times the image of the
originating $x$ sphere wraps round the target $\mathbf{n}$-sphere. A
mathematician would call this number the Brouwer degree of the map
$\mathbf{n}$. It is intuitively plausible that a continuous map from a
sphere to itself will wrap a whole number of times, and so we expect

\[
  N = \frac{1}{4\pi} \int_{\mathbb{R}^2} \left( \epsilon_{ijk}\, n^i\, \partial_1 n^j\, \partial_2 n^k \right) dx^1 dx^2   \tag{3.35}
\]

to be an integer. We will soon show that this is indeed so, but first we will
demonstrate that $N$ is a topological invariant.

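In two dimensions the form $F = \mathbf{n}^*\Omega$ is automatically closed
because the exterior derivative of any two-form is zero, there being no
three-forms in two dimensions. Even if we consider an
$\mathbf{n}(x^1, \ldots, x^m)$ field in $m > 2$ dimensions, however, we
still have $dF = 0$. This is because

\[
  dF = \frac{1}{2}\, \epsilon_{ijk}\, \partial_\sigma n^i\, \partial_\mu n^j\, \partial_\nu n^k \; dx^\sigma dx^\mu dx^\nu .   \tag{3.36}
\]

If we insert infinitesimal vectors into the $dx^\mu$ to get their components
$\delta x^\mu$, we have to evaluate the triple-product of three vectors
$\delta n^i = \partial_\mu n^i\, \delta x^\mu$, each of which is tangent to
the two-sphere. But the tangent space of $S^2$ is two-dimensional, so any
three tangent vectors $\mathbf{t}_1, \mathbf{t}_2, \mathbf{t}_3$ are linearly
dependent and their triple-product
$\mathbf{t}_1 \cdot (\mathbf{t}_2 \times \mathbf{t}_3)$ is zero.

Although it is closed, $F = \mathbf{n}^*\Omega$ will not generally be the
$d$ of a globally defined one-form. Suppose, however, that we vary the map,
$\mathbf{n} \to \mathbf{n} + \delta\mathbf{n}$. The change in the
topological charge density is

\[
  \delta F = \mathbf{n}^* \left[ \mathbf{n} \cdot (d(\delta\mathbf{n}) \times d\mathbf{n}) \right] ,   \tag{3.37}
\]

and this variation can be written as a total derivative

\[
  \delta F = d\left\{ \mathbf{n}^* [\mathbf{n} \cdot (\delta\mathbf{n} \times d\mathbf{n})] \right\}
           \equiv d\left\{ \epsilon_{ijk}\, n^i\, \delta n^j\, \partial_\mu n^k\, dx^\mu \right\} .   \tag{3.38}
\]

In these manipulations we have used
$\delta\mathbf{n} \cdot (d\mathbf{n} \times d\mathbf{n}) = d\mathbf{n} \cdot (\delta\mathbf{n} \times d\mathbf{n}) = 0$,
the triple-products being zero for the same reason adduced earlier. From
Stokes' theorem, we have

\[
  \delta N = \frac{1}{4\pi} \int_{S^2} \delta F
           = \frac{1}{4\pi} \int_{\partial S^2} \epsilon_{ijk}\, n^i\, \delta n^j\, \partial_\mu n^k\, dx^\mu .   \tag{3.39}
\]

Since $\partial S^2 = \emptyset$, we conclude that $\delta N = 0$ under any
smooth deformation of the map $\mathbf{n}(x)$. This is what we mean when we
say that $N$ is a topological invariant. Equivalently, on $\mathbb{R}^2$,
with $\mathbf{n}$ constant at infinity, we have

\[
  \delta N = \frac{1}{4\pi} \int_{\mathbb{R}^2} \delta F
           = \frac{1}{4\pi} \int_\Gamma \epsilon_{ijk}\, n^i\, \delta n^j\, \partial_\mu n^k\, dx^\mu ,   \tag{3.40}
\]

where $\Gamma$ is a curve surrounding the origin at large distance. Again
$\delta N = 0$, this time because $\partial_\mu n^k = 0$ everywhere on
$\Gamma$.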
In some physical applications, the field $\mathbf{n}$ winds in localized
regions called Skyrmions. These knots in the spin field behave very much as
elementary particles, retaining their identity as they move through the
material. The winding number counts how many Skyrmions (minus the number of
anti-Skyrmions, which wind with opposite orientation) there are. To
construct a smooth multi-Skyrmion map $\mathbb{R}^2 \to S^2$ with positive
winding number $N$, take a set of $N + 1$ complex numbers
$\lambda, a_1, \ldots, a_N$ and another set of $N$ numbers
$b_1, \ldots, b_N$ such that no $b$ coincides with any $a$. Then set

\[
  e^{i\phi} \tan\frac{\theta}{2} = \lambda\, \frac{(z - a_1) \cdots (z - a_N)}{(z - b_1) \cdots (z - b_N)} ,   \tag{3.41}
\]

where $z = x^1 + i x^2$, and $\theta$ and $\phi$ are spherical polar
co-ordinates specifying the direction $\mathbf{n}$. At the points $a_i$ the
vector $\mathbf{n}$ points straight up, and at the points $b_i$ it points
straight down. You will show in exercise 3.12 that this particular
$\mathbf{n}$-field configuration minimizes the energy functional

\[
  \begin{aligned}
    E[\mathbf{n}] &= \frac{1}{2} \int \left( \partial_1 \mathbf{n} \cdot \partial_1 \mathbf{n}
                     + \partial_2 \mathbf{n} \cdot \partial_2 \mathbf{n} \right) dx^1 dx^2 \\
                  &= \frac{1}{2} \int \left( |\nabla n^1|^2 + |\nabla n^2|^2 + |\nabla n^3|^2 \right) dx^1 dx^2
  \end{aligned}
  \tag{3.42}
\]

for the given winding number $N$. The next section will explain the geometric
origin of the mysterious combination $e^{i\phi} \tan\theta/2$.
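The claim that the configuration (3.41) has winding number $N$ can be tested by evaluating (3.35) on a grid. The following is a rough numerical sketch, assuming NumPy; the function names, the finite-difference derivatives, and the truncation of the plane to a finite square are all illustrative assumptions. The unit vector is obtained from $w = e^{i\phi}\tan\theta/2$ through $n^1 + i n^2 = 2w/(1 + |w|^2)$ and $n^3 = (1 - |w|^2)/(1 + |w|^2)$.

```python
import numpy as np

def skyrmion_field(x1, x2, lam, a, b):
    """Unit-vector field n(x) defined by w = lam * prod(z - a_k) / prod(z - b_k).

    The points in `b` must not coincide with grid points, where w would be infinite."""
    z = x1 + 1j * x2
    w = lam * np.prod([z - ak for ak in a], axis=0) / np.prod([z - bk for bk in b], axis=0)
    denom = 1.0 + np.abs(w) ** 2
    return np.array([2 * w.real / denom, 2 * w.imag / denom, (1.0 - np.abs(w) ** 2) / denom])

def winding_number(n, h):
    """Grid estimate of N = (1/4 pi) * integral of n . (d_1 n x d_2 n) dx^1 dx^2.

    n has shape (3, Nx, Ny) and is sampled with spacing h in both directions,
    with x^1 running along axis 1 and x^2 along axis 2."""
    d1 = np.gradient(n, h, axis=1)
    d2 = np.gradient(n, h, axis=2)
    density = np.einsum("iab,iab->ab", n, np.cross(d1, d2, axis=0))
    return density.sum() * h * h / (4 * np.pi)
```

Because the charge density of a localized Skyrmion falls off rapidly with distance, a square a few tens of Skyrmion sizes across already gives a value close to an integer; the small deficit comes from the part of the plane that has been cut off.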



3.4.3 The Hopf Map 

You may recall that in section 1.2.3 we defined complex projective space
$CP^n$ to be the set of rays in a complex $n + 1$ dimensional vector space.
A ray is an equivalence class of vectors
$[\zeta_1, \zeta_2, \ldots, \zeta_{n+1}]$, where the $\zeta_i$ are not all
zero, and where we do not distinguish between
$[\zeta_1, \zeta_2, \ldots, \zeta_{n+1}]$ and
$[\lambda\zeta_1, \lambda\zeta_2, \ldots, \lambda\zeta_{n+1}]$ for non-zero
$\lambda$. The space of rays is a 2n-dimensional real manifold: in a region
where $\zeta_{n+1}$ does not vanish, we can take as co-ordinates the real
numbers $\xi_1, \ldots, \xi_n, \eta_1, \ldots, \eta_n$ where

\[
  \xi_1 + i\eta_1 = \frac{\zeta_1}{\zeta_{n+1}} , \quad
  \xi_2 + i\eta_2 = \frac{\zeta_2}{\zeta_{n+1}} , \quad \ldots , \quad
  \xi_n + i\eta_n = \frac{\zeta_n}{\zeta_{n+1}} .   \tag{3.43}
\]

Similar co-ordinate charts can be constructed in the regions where other
$\zeta_i$ are non-zero. Every point in $CP^n$ lies in at least one of these
co-ordinate charts, and the co-ordinate transformation rules for going from
one chart to another are smooth.

The simplest complex projective space, $CP^1$, is the real two-sphere $S^2$
in disguise. This rather non-obvious fact is revealed by the use of a
stereographic map to make the equivalence class
$[\zeta_1, \zeta_2] \in CP^1$ correspond to a point $\mathbf{n}$ on the
sphere. When $\zeta_1$ is non zero, the class $[\zeta_1, \zeta_2]$ is
uniquely determined by the ratio
$\zeta_2/\zeta_1 = |\zeta_2/\zeta_1|\, e^{i\phi}$, which we plot on the
complex plane. We think of this copy of $\mathbb{C}$ as being the $x, y$
plane in $\mathbb{R}^3$. We then draw a straight line connecting the plotted
point to the south pole of a unit sphere circumscribed about the origin in
$\mathbb{R}^3$. The point where this line (continued if necessary)
intersects the sphere is the tip of the unit vector $\mathbf{n}$.

Figure 3.8: Two views of the stereographic map between the two-sphere and
the complex plane. The point $\zeta = \zeta_2/\zeta_1 \in \mathbb{C}$
corresponds to the unit vector $\mathbf{n} \in S^2$.



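If $\zeta_2$ were zero, we would end up at the north pole where the
$\mathbb{R}^3$ co-ordinate $z$ takes the value $z = 1$. If $\zeta_1$ goes to
zero with $\zeta_2$ fixed, we move smoothly to the south pole $z = -1$. We
therefore extend the definition of our map to the case $\zeta_1 = 0$ by
making the equivalence class $[0, \zeta_2]$ correspond to the south pole. We
can find an explicit formula for this map. Figure 3.8 shows that
$\zeta_2/\zeta_1 = e^{i\phi} \tan\theta/2$, and this relation suggests the
use of the "t"-substitution formulae

\[
  \sin\theta = \frac{2t}{1 + t^2} , \qquad \cos\theta = \frac{1 - t^2}{1 + t^2} ,   \tag{3.44}
\]

where $t = \tan\theta/2$. Since the $x, y, z$ components of $\mathbf{n}$ are
given by

\[
  \begin{aligned}
    n^1 &= \sin\theta \cos\phi , \\
    n^2 &= \sin\theta \sin\phi , \\
    n^3 &= \cos\theta ,
  \end{aligned}
\]

we find that

\[
  n^1 + i n^2 = \frac{2 (\zeta_2/\zeta_1)}{1 + |\zeta_2/\zeta_1|^2} , \qquad
  n^3 = \frac{1 - |\zeta_2/\zeta_1|^2}{1 + |\zeta_2/\zeta_1|^2} .   \tag{3.45}
\]

We can multiply through by $|\zeta_1|^2 = \bar\zeta_1 \zeta_1$ and so write
this correspondence in a more symmetrical manner:

\[
  \begin{aligned}
    n^1 &= \frac{\bar\zeta_1 \zeta_2 + \bar\zeta_2 \zeta_1}{|\zeta_1|^2 + |\zeta_2|^2} , \\
    n^2 &= \frac{1}{i} \left( \frac{\bar\zeta_1 \zeta_2 - \bar\zeta_2 \zeta_1}{|\zeta_1|^2 + |\zeta_2|^2} \right) , \\
    n^3 &= \frac{|\zeta_1|^2 - |\zeta_2|^2}{|\zeta_1|^2 + |\zeta_2|^2} .
  \end{aligned}
  \tag{3.46}
\]

This last form can be conveniently expressed in terms of the Pauli sigma
matrices

\[
  \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} , \quad
  \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} , \quad
  \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} ,   \tag{3.47}
\]

as

\[
  \begin{aligned}
    n^1 &= (\bar z_1, \bar z_2) \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} , \\
    n^2 &= (\bar z_1, \bar z_2) \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} , \\
    n^3 &= (\bar z_1, \bar z_2) \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix} \begin{pmatrix} z_1 \\ z_2 \end{pmatrix} ,
  \end{aligned}
  \tag{3.48}
\]

where

\[
  \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}
  = \frac{1}{\sqrt{|\zeta_1|^2 + |\zeta_2|^2}} \begin{pmatrix} \zeta_1 \\ \zeta_2 \end{pmatrix}   \tag{3.49}
\]

is a normalized 2-vector, which we think of as a spinor.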

The $CP^1 \simeq S^2$ correspondence now has a quantum mechanical
interpretation: Any unit three-vector $\mathbf{n}$ can be obtained as the
expectation value of the $\sigma$ matrices in a normalized spinor state.
Conversely, any normalized spinor $\psi = (z_1, z_2)^T$ gives rise to a unit
vector via

\[
  n^i = \psi^\dagger \sigma^i \psi .   \tag{3.50}
\]

Now, since

\[
  1 = |z_1|^2 + |z_2|^2 ,   \tag{3.51}
\]

the normalized spinor can be thought of as defining a point in $S^3$. This
means that the one-to-one correspondence
$[z_1, z_2] \leftrightarrow \mathbf{n}$ also gives rise to a map from
$S^3 \to S^2$. This is called the Hopf map:

\[
  {\rm Hopf} : S^3 \to S^2 .   \tag{3.52}
\]

The dimension reduces from three to two, so the Hopf map cannot be
one-to-one. Even after we have normalized $[\zeta_1, \zeta_2]$, we are still
left with a choice of overall phase. Both $(z_1, z_2)$ and
$(z_1 e^{i\theta}, z_2 e^{i\theta})$, although distinct points in $S^3$,
correspond to the same point in $CP^1$, and hence in $S^2$. The inverse image
of a point in $S^2$ is a geodesic circle in $S^3$. Later we will show that
any two such geodesic circles are linked, and this makes the Hopf map
topologically non-trivial in that it cannot be continuously deformed to a
constant map, i.e. to a map that takes all of $S^3$ to a single point in
$S^2$.
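The map from spinors to unit vectors is short enough to write out explicitly. The sketch below assumes NumPy; the names `SIGMA` and `hopf_map` are illustrative. It normalizes a two-component complex vector, evaluates $n^i = \psi^\dagger \sigma^i \psi$ with the Pauli matrices, and can be used to confirm that the result has unit length and is unchanged by an overall phase.

```python
import numpy as np

# The three Pauli matrices, stacked so that SIGMA[i] is sigma_{i+1}.
SIGMA = np.array([[[0, 1], [1, 0]],
                  [[0, -1j], [1j, 0]],
                  [[1, 0], [0, -1]]], dtype=complex)

def hopf_map(psi):
    """Send a non-zero spinor (z1, z2) to the unit vector n^i = psi^dagger sigma^i psi."""
    psi = np.asarray(psi, dtype=complex)
    psi = psi / np.linalg.norm(psi)          # normalize so |z1|^2 + |z2|^2 = 1
    return np.real(np.einsum("a,iab,b->i", psi.conj(), SIGMA, psi))
```

For instance, `hopf_map([1, 0])` is the north pole and `hopf_map([0, 1])` is the south pole, in agreement with the stereographic picture, while multiplying the spinor by any phase $e^{i\theta}$ leaves the output unchanged.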

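Exercise 3.6: We have seen that the stereographic map relates the point with
spherical polar co-ordinates $\theta$, $\phi$ to the complex number

\[
  \zeta = e^{i\phi} \tan\theta/2 .
\]

We can therefore set $\zeta = \xi + i\eta$ and take $\xi$, $\eta$ as
stereographic co-ordinates on the sphere. Show that in these co-ordinates
the sphere metric is given by

\[
  \begin{aligned}
    g(\ ,\ ) &\equiv d\theta \otimes d\theta + \sin^2\theta \; d\phi \otimes d\phi \\
             &= \frac{2}{(1 + |\zeta|^2)^2} \left( d\bar\zeta \otimes d\zeta + d\zeta \otimes d\bar\zeta \right) \\
             &= \frac{4}{(1 + \xi^2 + \eta^2)^2} \left( d\xi \otimes d\xi + d\eta \otimes d\eta \right) ,
  \end{aligned}
\]

and the area 2-form becomes

\[
  \begin{aligned}
    \Omega &\equiv \sin\theta \; d\theta \wedge d\phi \\
           &= \frac{2i}{(1 + |\zeta|^2)^2}\, d\zeta \wedge d\bar\zeta \\
           &= \frac{4}{(1 + \xi^2 + \eta^2)^2}\, d\xi \wedge d\eta .
  \end{aligned}
\]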
3.4.4 Homotopy and the Hopf map

We can use the Hopf map to factor the map
$\mathbf{n} : x \mapsto \mathbf{n}(x)$ through the three-sphere by specifying
the spinor $\psi$ at each point, instead of the vector $\mathbf{n}$, and so
mapping indirectly

\[
  \mathbb{R}^2 \xrightarrow{\ \psi\ } S^3 \xrightarrow{\ {\rm Hopf}\ } S^2 .
\]

It might seem that for a given spin-field $\mathbf{n}(x)$ we can choose the
overall phase of $\psi(x) \equiv (z_1(x), z_2(x))^T$ as we like, but if we
demand that the $z_i$'s be continuous functions of $x$ there is a rather
non-obvious topological restriction which has important physical
consequences. To see how this comes about we first express the winding number
in terms of the $z_i$. We find (after a page or two of algebra)

\[
  F = \left( \epsilon_{ijk}\, n^i\, \partial_1 n^j\, \partial_2 n^k \right) dx^1 dx^2
    = \frac{2}{i} \sum_{i=1}^{2} \left( \partial_1 \bar z_i\, \partial_2 z_i - \partial_2 \bar z_i\, \partial_1 z_i \right) dx^1 dx^2 ,   \tag{3.54}
\]

and so the topological charge $N$ is given by

\[
  N = \frac{1}{2\pi i} \int \sum_{i=1}^{2} \left( \partial_1 \bar z_i\, \partial_2 z_i - \partial_2 \bar z_i\, \partial_1 z_i \right) dx^1 dx^2 .   \tag{3.55}
\]

Now, when written in terms of the $z_i$ variables, the form $F$ becomes a
total derivative:

\[
  \begin{aligned}
    F &= \frac{2}{i} \sum_{i=1}^{2} \left( \partial_1 \bar z_i\, \partial_2 z_i - \partial_2 \bar z_i\, \partial_1 z_i \right) dx^1 dx^2 \\
      &= d\left\{ \frac{1}{i} \sum_{i=1}^{2} \left( \bar z_i\, dz_i - (d\bar z_i)\, z_i \right) \right\} .
  \end{aligned}
  \tag{3.56}
\]

Further, because $\mathbf{n}$ is fixed at large distance, we have
$(z_1, z_2) = e^{i\theta}(c_1, c_2)$ near infinity, where $c_1, c_2$ are
constants with $|c_1|^2 + |c_2|^2 = 1$. Thus, near infinity,

\[
  \frac{1}{2i} \sum_{i=1}^{2} \left( \bar z_i\, dz_i - (d\bar z_i)\, z_i \right)
  \to (|c_1|^2 + |c_2|^2)\, d\theta = d\theta .   \tag{3.57}
\]

We combine this observation with Stokes' theorem to obtain

\[
  N = \frac{1}{2\pi i} \int_\Gamma \frac{1}{2} \sum_{i=1}^{2} \left( \bar z_i\, dz_i - (d\bar z_i)\, z_i \right)
    = \frac{1}{2\pi} \int_\Gamma d\theta .   \tag{3.58}
\]

Here, as in the previous section, $\Gamma$ is a curve surrounding the origin
at large distance. Now $\int d\theta$ is the total change in $\theta$ as we
circle the boundary. While the phase $e^{i\theta}$ has to return to its
original value after a round trip, the angle $\theta$ can increase by an
integer multiple of $2\pi$. The winding number $\oint d\theta/2\pi$
can therefore be non-zero, but must be an integer.
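The statement that $\oint d\theta/2\pi$ is an integer even though $\theta$ itself need not be single valued is easy to see in a discrete setting. The following minimal sketch assumes NumPy, and the function name `phase_winding` is illustrative. It accumulates the small phase increments between neighbouring samples of $z = e^{i\theta} c$ taken around a closed curve.

```python
import numpy as np

def phase_winding(z):
    """Winding number (1/2 pi) * closed-loop integral of d(theta) for samples
    z[k] = exp(i theta_k) * c taken in order around a closed curve.

    The samples must be non-zero and dense enough that the phase changes by
    less than pi between neighbours."""
    z = np.asarray(z, dtype=complex)
    steps = np.angle(np.roll(z, -1) / z)     # phase increments, each in (-pi, pi]
    return int(np.rint(steps.sum() / (2 * np.pi)))
```

Each increment is defined only modulo $2\pi$, but their sum around the closed loop is forced to be a multiple of $2\pi$ because the phase factor returns to its starting value.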

We have uncovered the rather surprising fact that the topological charge
of the map $\mathbf{n} : S^2 \to S^2$ is equal to the winding number of the
phase angle $\theta$ at infinity. This is the topological constraint
referred to earlier. As a byproduct, we have confirmed our conjecture that
the topological charge $N$ is an integer. The existence of this integer
invariant shows that the smooth maps $\mathbf{n} : S^2 \to S^2$ fall into
distinct homotopy classes labeled by $N$. Maps with different values of $N$
cannot be continuously deformed into one another, and, while we have not
shown that it is so, two maps with the same value of $N$ can be deformed
into each other.

Maps that can be continuously deformed one into the other are said to
be homotopic. The set of homotopy classes of the maps of the n-sphere into
a manifold $M$ is denoted by $\pi_n(M)$. In the present case $M = S^2$. We
are therefore claiming that

\[
  \pi_2(S^2) = \mathbb{Z} ,   \tag{3.59}
\]

where we are identifying the homotopy class with its winding number
$N \in \mathbb{Z}$.

3.4.5 The Hopf index 

We have so far discussed maps from $S^2$ to $S^2$. It is perhaps not too
surprising that such maps are classified by a winding number. What is rather
more surprising is that maps $\mathbf{n} : S^3 \to S^2$ also have an
associated topological number. If we continue to assume that $\mathbf{n}$
tends to a constant direction at infinity so that we can think of
$\mathbb{R}^3 \cup \{\infty\}$ as being $S^3$, this number will label the
homotopy classes $\pi_3(S^2)$ of fields of unit vectors $\mathbf{n}$ in
three dimensions. We will think of the third dimension as being time. In
this situation an interesting set of $\mathbf{n}$ fields to consider are the
$\mathbf{n}(x, t)$ corresponding to moving Skyrmions. The world lines of
these Skyrmions will be tubes outside of which $\mathbf{n}$ is constant, and
such that on any slice through the tube, $\mathbf{n}$ will cover the target
$\mathbf{n}$-sphere once.

To motivate the formula we will find for the topological number, we begin
with a problem from magnetostatics. Suppose we are given a cable originally
made up of a bundle of many parallel wires. The cable is then twisted $N$
times about its axis and bent into a closed loop, the end of each individual
wire being attached to its beginning to make a continuous circuit. A current
$I$ flows in the cable in such a manner that each individual wire carries
only a small part $\delta I_i$ of the total. The sense of the current is such
that as we flow with it around the cable each wire wraps $N$ times
anticlockwise about all the others. The current produces a magnetic field
$\mathbf{B}$. Can we determine the integer twisting number $N$ knowing only
this $\mathbf{B}$ field?

Figure 3.9: A twisted cable with $N = 5$.

The answer is yes. We use Ampere's law in integral form,

\[
  \oint_\Gamma \mathbf{B} \cdot d\mathbf{r} = (\text{current encircled by } \Gamma) .   \tag{3.60}
\]

We also observe that the current density
$\nabla \times \mathbf{B} = \mathbf{J}$ at a point is directed along the
tangent to the wire passing through that point. We therefore integrate along
each individual wire as it encircles the others, and sum over the wires to
find

\[
  \sum_{{\rm wires}\ i} \delta I_i \oint \mathbf{B} \cdot d\mathbf{r}_i
  = \int \mathbf{B} \cdot \mathbf{J} \; d^3x
  = \int \mathbf{B} \cdot (\nabla \times \mathbf{B}) \; d^3x = N I^2 .   \tag{3.61}
\]

We now apply this insight to our three-dimensional field of unit vectors
$\mathbf{n}(x)$.
The quantity playing the role of the current density $\mathbf{J}$ is the
topological current

\[
  J^\sigma = \frac{1}{2}\, \epsilon^{\sigma\mu\nu} \epsilon_{ijk}\, n^i\, \partial_\mu n^j\, \partial_\nu n^k .   \tag{3.62}
\]

We note that $\nabla \cdot \mathbf{J} = 0$. This is simply another way of
saying that the 2-form $F = \mathbf{n}^*\Omega$ is closed.

The flux of $\mathbf{J}$ through a surface $S$ is

\[
  I = \int_S \mathbf{J} \cdot d\mathbf{S} = \int_S F ,   \tag{3.63}
\]

and this is the area of the spherical surface covered by the $\mathbf{n}$'s.
A Skyrmion, for example, has total topological current $I = 4\pi$, the total
surface area of the 2-sphere. The Skyrmion world-line will play the role of
the cable, and the inverse images in $\mathbb{R}^3$ of points on $S^2$
correspond to the individual wires.

In form language, the field corresponding to $\mathbf{B}$ can be any
one-form $A$ such that $dA = F$. Thus

\[
  N_{\rm Hopf} = \frac{1}{I^2} \int_{\mathbb{R}^3} A F = \frac{1}{16\pi^2} \int_{\mathbb{R}^3} A F   \tag{3.64}
\]

will be an integer. This integer is the Hopf linking number, or Hopf index,
and counts the number of times the Skyrmion twists before it bites its tail
to form a closed-loop world-line.

There is another way of obtaining this formula, and of understanding the
number $16\pi^2$. We observe that the two-form $F$ and the one-form $A$ are
the pull-back from $S^3$ to $\mathbb{R}^3$ along $\psi$ of the forms

\[
  \begin{aligned}
    \mathcal{F} &= \frac{1}{i} \sum_{i=1}^{2} \left( d\bar z_i\, dz_i - dz_i\, d\bar z_i \right) , \\
    \mathcal{A} &= \frac{1}{i} \sum_{i=1}^{2} \left( \bar z_i\, dz_i - z_i\, d\bar z_i \right) ,
  \end{aligned}
  \tag{3.65}
\]

respectively. If we substitute
$z_{1,2} = \xi_{1,2} + i\eta_{1,2}$, we find that

\[
  \mathcal{A}\mathcal{F} = 8 \left( \xi_1\, d\eta_1\, d\xi_2\, d\eta_2 - \eta_1\, d\xi_1\, d\xi_2\, d\eta_2
    + \xi_2\, d\eta_2\, d\xi_1\, d\eta_1 - \eta_2\, d\xi_2\, d\xi_1\, d\eta_1 \right) .   \tag{3.66}
\]

We know from exercise 3.2 that this expression is eight times the volume
3-form on the three-sphere. Now the total volume of the unit three-sphere is
$2\pi^2$, and so, from our factored map
$x \mapsto \psi \mapsto \mathbf{n}$ we have that

\[
  N_{\rm Hopf} = \frac{1}{16\pi^2} \int_{\mathbb{R}^3} \psi^*(\mathcal{A}\mathcal{F})
               = \frac{1}{2\pi^2} \int_{\mathbb{R}^3} \psi^* d({\rm Volume\ on\ } S^3)   \tag{3.67}
\]

is the number of times the normalized spinor $\psi(x)$ covers $S^3$ as $x$
covers $\mathbb{R}^3$. For the Hopf map itself, this number is unity, and so
the loop in $S^3$ which is the inverse image of a point in $S^2$ will twist
once around any other such inverse image loop.

We have now established that

\[
  \pi_3(S^2) = \mathbb{Z} .   \tag{3.68}
\]

This result, implying that there are many maps from the three-sphere to
the two-sphere that are not smoothly deformable to a constant map, was a
great surprise when Hopf discovered it.

One of the principal physics consequences of the existence of the Hopf
index is that "quantum lump" quasi-particles like the Skyrmion can be
fermions, even though they are described by commuting (and therefore boson)
fields. To understand how this can be, we first explain that the collection
of homotopy classes $\pi_n(M)$ is not just a set. It has the additional
structure of being a group: we can compose two homotopy classes to get a
third, the composition is associative, and each homotopy class has an
inverse. To define the group composition law, we think of $S^n$ as the
interior of an n-dimensional cube with the map $f : S^n \to M$ taking a fixed
value $m_0 \in M$ at all points on the boundary of the cube. The boundary can
then be considered to be a single point on $S^n$. We then take one of the
$n$ dimensions as being "time" and place two cubes and their maps
$f_1$, $f_2$ into contact, with $f_1$ being "earlier" and $f_2$ being
"later." We thus get a continuous map from a bigger box into $M$. The
homotopy class of this map, after we relax the condition that the map takes
the value $m_0$ on the common boundary, defines the composition
$[f_2] \circ [f_1]$ of the two homotopy classes corresponding to $f_1$ and
$f_2$. The composition may be shown to be independent of the choice of
representative functions in the two classes. The inverse of a homotopy class
$[f]$ is obtained by reversing the direction of "time" for each of the maps
in the class. This group structure appears to depend on the fixed point
$m_0$. As long as $M$ is arcwise connected, however, the groups obtained from
different $m_0$'s are isomorphic, or equivalent. In the case of
$\pi_2(S^2) = \mathbb{Z}$ and $\pi_3(S^2) = \mathbb{Z}$, the composition law
is simply the addition of the integers $N \in \mathbb{Z}$ that label the
classes. A full account of homotopy theory for working physicists is to be
found in a readable review article by David Mermin.³

³ N. D. Mermin, "The topological theory of defects in ordered media." Rev. Mod. Phys.
51 (1979) 591.
When we quantize using Feynman's "sum over histories" path integral, we
may multiply the contributions of histories $f$ that are not deformable into
one another by different phase factors $\exp\{i\phi([f])\}$. The choice of
phases must, however, be compatible with the composition of histories by
concatenating one after the other, essentially the same operation as
composing homotopy classes. This means that the product
$\exp\{i\phi([f_1])\} \exp\{i\phi([f_2])\}$ of the phase factors for two
possible histories must be the phase factor
$\exp\{i\phi([f_2] \circ [f_1])\}$ assigned to the composition of their
homotopy classes. If our quantum system consists of spins $\mathbf{n}$ in
two space and one time dimension we can consistently assign a phase factor
$\exp(i\pi N_{\rm Hopf})$ to a history. The rotation of a single Skyrmion
through $2\pi$ makes $N_{\rm Hopf} = 1$ and so the wavefunction changes
sign. We will show in the next section that a history where two particles
change places can be continuously deformed into a history where they do not
interchange, but instead one of them is twisted through $2\pi$. The
wavefunction of a pair of Skyrmions therefore changes sign when they are
interchanged. This means that the quantized Skyrmion is a fermion.



3.4.6 Twist and Writhe 

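Consider two oriented non-intersecting closed curves $\gamma_1$ and
$\gamma_2$. We can use Ampere's law to count the number of times $\gamma_1$
encircles $\gamma_2$ by imagining that $\gamma_2$ carries a unit current in
the direction of its orientation, and evaluating

\[
  \begin{aligned}
    {\rm Lk}(\gamma_1, \gamma_2) &= \oint_{\gamma_1} \mathbf{B}(\mathbf{r}_1) \cdot d\mathbf{r}_1 \\
      &= \frac{1}{4\pi} \oint_{\gamma_1} \oint_{\gamma_2}
         \frac{(\mathbf{r}_1 - \mathbf{r}_2) \cdot (d\mathbf{r}_1 \times d\mathbf{r}_2)}{|\mathbf{r}_1 - \mathbf{r}_2|^3} .
  \end{aligned}
  \tag{3.69}
\]

Here the second line follows from the first by an application of the
Biot-Savart law to compute the $\mathbf{B}$ field due to the current. The
second line shows that the Gauss linking number
${\rm Lk}(\gamma_1, \gamma_2)$ is symmetric under the interchange
$\gamma_1 \leftrightarrow \gamma_2$ of the two curves. It changes sign,
however, if one of the curves changes orientation, or if the pair of curves
is reflected in a mirror.

Introduce parameters $t_1$, $t_2$ with $0 \le t_1, t_2 < 1$ to label points
on the two curves. The curves are closed, so
$\mathbf{r}_1(0) = \mathbf{r}_1(1)$, and similarly for $\mathbf{r}_2$. Let us
also define a unit vector

\[
  \mathbf{n}(t_1, t_2) = \frac{\mathbf{r}_1(t_1) - \mathbf{r}_2(t_2)}{|\mathbf{r}_1(t_1) - \mathbf{r}_2(t_2)|} .   \tag{3.70}
\]

Then

\[
  {\rm Lk}(\gamma_1, \gamma_2) = -\frac{1}{4\pi} \int_0^1 \!\! \int_0^1
    \mathbf{n} \cdot \left( \frac{\partial \mathbf{n}}{\partial t_1} \times \frac{\partial \mathbf{n}}{\partial t_2} \right) dt_1\, dt_2   \tag{3.71}
\]

is seen to be (minus) the winding number of the map

\[
  \mathbf{n} : [0, 1] \times [0, 1] \to S^2   \tag{3.72}
\]

of the 2-torus into the sphere. Our previous results on maps into the
2-sphere therefore confirm our Ampere-law intuition that
${\rm Lk}(\gamma_1, \gamma_2)$ is an integer. The linking number is also a
topological invariant, being unchanged under any deformation of the curves
that does not cause one to pass through the other.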
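The Gauss double integral (3.69) can be evaluated directly for sampled curves. The sketch below assumes NumPy; the function name `linking_number` and the central-difference estimate of $d\mathbf{r}$ are illustrative assumptions, and the curves must be sampled finely compared with their separation. For two circles that pass through each other's centres the result is close to $\pm 1$, with the sign fixed by the orientations, while for two well separated circles it is close to zero.

```python
import numpy as np

def linking_number(curve1, curve2):
    """Discretized Gauss double integral (3.69) for two closed curves.

    Each curve is an (n, 3) array of points listed in order around the loop."""
    r1 = np.asarray(curve1, dtype=float)
    r2 = np.asarray(curve2, dtype=float)
    # Central-difference estimates of dr along each closed curve.
    dr1 = (np.roll(r1, -1, axis=0) - np.roll(r1, 1, axis=0)) / 2
    dr2 = (np.roll(r2, -1, axis=0) - np.roll(r2, 1, axis=0)) / 2
    sep = r1[:, None, :] - r2[None, :, :]                    # r1 - r2 for every pair
    cross = np.cross(dr1[:, None, :], dr2[None, :, :])       # dr1 x dr2
    integrand = np.einsum("abi,abi->ab", sep, cross) / np.linalg.norm(sep, axis=2) ** 3
    return integrand.sum() / (4 * np.pi)
```

Reversing the order of the points in one of the arrays reverses that curve's orientation and flips the sign of the result, as stated in the text.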

An important application of these ideas occurs in biology, where the
curves are the two complementary strands of a closed loop of DNA. We can
think of two such parallel curves as forming the edges of a ribbon $\{\gamma_1,\gamma_2\}$ of
width $\epsilon$. Let us denote by $\gamma$ the curve $\mathbf{r}(t)$ running along the axis of the
ribbon midway between $\gamma_1$ and $\gamma_2$. The unit tangent to $\gamma$ at the point $\mathbf{r}(t)$ is

$$
\mathbf{t}(t) = \frac{\dot{\mathbf{r}}(t)}{|\dot{\mathbf{r}}(t)|},
\tag{3.73}
$$

where the dots denote differentiation with respect to $t$. We also introduce a
unit vector $\mathbf{u}(t)$ that is perpendicular to $\mathbf{t}(t)$ and lies in the ribbon, pointing
from $\mathbf{r}_1(t)$ to $\mathbf{r}_2(t)$.

Figure 3.10: An oriented ribbon $\{\gamma_1,\gamma_2\}$ showing the vectors $\mathbf{t}$ and $\mathbf{u}$.

We will assign a common value of the parameter $t$ to a point on $\gamma$ and the
points nearest to $\mathbf{r}(t)$ on $\gamma_1$ and $\gamma_2$. Consequently

$$
\begin{aligned}
\mathbf{r}_1(t) &= \mathbf{r}(t) - \tfrac{1}{2}\epsilon\,\mathbf{u}(t), \\
\mathbf{r}_2(t) &= \mathbf{r}(t) + \tfrac{1}{2}\epsilon\,\mathbf{u}(t).
\end{aligned}
\tag{3.74}
$$

We can express $\dot{\mathbf{u}}$ as

$$
\dot{\mathbf{u}} = \boldsymbol{\omega}\times\mathbf{u}
\tag{3.75}
$$

for some angular-velocity vector $\boldsymbol{\omega}(t)$. The quantity

$$
\mathrm{Tw} = \frac{1}{2\pi}\oint_\gamma (\boldsymbol{\omega}\cdot\mathbf{t})\,dt
\tag{3.76}
$$

is called the Twist of the ribbon. It is not usually an integer, and is a
property of the ribbon $\{\gamma_1,\gamma_2\}$ itself, being independent of the choice of
parameterization $t$.

If we set $\mathbf{r}_1(t)$ and $\mathbf{r}_2(t)$ equal to the single axis curve $\mathbf{r}(t)$ in the integrand
of (3.69), the resulting "self-linking" integral, or Writhe,

$$
\mathrm{Wr} = \frac{1}{4\pi}\oint_\gamma\oint_\gamma
\frac{(\mathbf{r}(t_1)-\mathbf{r}(t_2))\cdot(\dot{\mathbf{r}}(t_1)\times\dot{\mathbf{r}}(t_2))}{|\mathbf{r}(t_1)-\mathbf{r}(t_2)|^3}\,dt_1\,dt_2,
\tag{3.77}
$$

remains convergent despite the factor of $|\mathbf{r}(t_1)-\mathbf{r}(t_2)|^3$ in the denominator.
However, if we try to achieve this substitution by making the width of the
ribbon $\epsilon$ tend to zero, we find that the vector $\mathbf{n}(t_1,t_2)$ abruptly reverses its
direction as $t_1$ passes $t_2$. In the limit of infinitesimal width this violent motion
provides a delta-function contribution

$$
-(\boldsymbol{\omega}\cdot\mathbf{t})\,\delta(t_1-t_2)\,dt_1\wedge dt_2
\tag{3.78}
$$

to the 2-sphere area swept out by $\mathbf{n}$, and this contribution is invisible to the
Writhe integral. The Writhe is a property only of the overall shape of the
axis curve $\gamma$, and is independent both of the ribbon that contains it, and of
the choice of parameterization. The linking number, on the other hand, is
independent of $\epsilon$, so the $\epsilon \to 0$ limit of the linking-number integral is not the
integral of the $\epsilon \to 0$ limit of its integrand. Instead we have

$$
\mathrm{Lk}(\gamma_1,\gamma_2) = \frac{1}{2\pi}\oint_\gamma (\boldsymbol{\omega}\cdot\mathbf{t})\,dt
+ \frac{1}{4\pi}\oint_\gamma\oint_\gamma
\frac{(\mathbf{r}(t_1)-\mathbf{r}(t_2))\cdot(\dot{\mathbf{r}}(t_1)\times\dot{\mathbf{r}}(t_2))}{|\mathbf{r}(t_1)-\mathbf{r}(t_2)|^3}\,dt_1\,dt_2.
\tag{3.79}
$$

This formula

$$
\mathrm{Lk} = \mathrm{Tw} + \mathrm{Wr}
\tag{3.80}
$$

is known as the Călugăreanu-White-Fuller relation, and is the basis for the
claim, made in the previous section, that the worldline of an extended particle
with an exchange ($\mathrm{Wr} = \pm 1$) can be deformed into a worldline with a $2\pi$
rotation ($\mathrm{Tw} = \pm 1$) without changing the topologically invariant linking
number.
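
The relation Lk = Tw + Wr can be checked numerically in the simplest case. The sketch below is an illustration under stated assumptions rather than anything taken from the notes: the axis is taken to be the unit circle in the xy-plane (so the Writhe integrand vanishes identically, because all three vectors in its triple product lie in the plane), and the ribbon normal is assumed to be $\mathbf{u} = \cos(m\theta)\,\mathbf{e}_r + \sin(m\theta)\,\mathbf{e}_z$, which turns $m$ times about the tangent. The function names are hypothetical.

```python
import math

def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1], a[2] * b[0] - a[0] * b[2], a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def twisted_circular_ribbon(m, eps, n):
    """Sample a ribbon of width eps whose axis is the unit circle in the xy-plane
    and whose normal u turns m times about the tangent in one circuit.
    All derivatives are taken with respect to the polar angle theta."""
    samples = []
    for i in range(n):
        th = 2 * math.pi * (i + 0.5) / n
        e_r = (math.cos(th), math.sin(th), 0.0)    # point r(theta) on the axis
        e_t = (-math.sin(th), math.cos(th), 0.0)   # unit tangent, equal to d e_r / d theta
        c, s = math.cos(m * th), math.sin(m * th)
        u = (c * e_r[0], c * e_r[1], s)            # u = c e_r + s e_z
        du = (-m * s * e_r[0] + c * e_t[0], -m * s * e_r[1] + c * e_t[1], m * c)
        r1 = tuple(e_r[k] - 0.5 * eps * u[k] for k in range(3))   # edge curves (3.74)
        r2 = tuple(e_r[k] + 0.5 * eps * u[k] for k in range(3))
        d1 = tuple(e_t[k] - 0.5 * eps * du[k] for k in range(3))
        d2 = tuple(e_t[k] + 0.5 * eps * du[k] for k in range(3))
        samples.append((e_t, u, du, r1, d1, r2, d2))
    return samples

def twist_and_link(m, eps=0.3, n=300):
    """Return (Tw, Lk) for the twisted circular ribbon."""
    samples = twisted_circular_ribbon(m, eps, n)
    # Since du/dtheta = omega x u and u is perpendicular to t, omega . t = t . (u x du/dtheta).
    # Tw is (1/2 pi) times its integral over theta, i.e. its average over the samples.
    tw = sum(dot(e_t, cross(u, du)) for e_t, u, du, *_ in samples) / n
    total = 0.0
    for _, _, _, r1, d1, _, _ in samples:
        for _, _, _, _, _, r2, d2 in samples:
            R = (r1[0] - r2[0], r1[1] - r2[1], r1[2] - r2[2])
            total += dot(R, cross(d1, d2)) / dot(R, R) ** 1.5
    lk = total * math.pi / n ** 2   # (1/4 pi) (2 pi / n)^2 times the double sum
    return tw, lk

print(twist_and_link(2))
```

With this particular choice of $\mathbf{u}$ the rotation from $\mathbf{e}_r$ towards $\mathbf{e}_z$ is left-handed about the tangent, so the Twist comes out as $-m$. The Gauss integral for the two edge curves should agree with it to within the quadrature error, as (3.80) with Wr = 0 requires.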





Figure 3.11: Cutting and reassembling the domain of integration in (3.82).

By setting

$$
\mathbf{n}(t_1,t_2) = \frac{\mathbf{r}(t_1)-\mathbf{r}(t_2)}{|\mathbf{r}(t_1)-\mathbf{r}(t_2)|}
\tag{3.81}
$$

we can express the Writhe as

$$
\mathrm{Wr} = -\frac{1}{4\pi}\int_{T^2}
\mathbf{n}\cdot\left(\frac{\partial\mathbf{n}}{\partial t_1}\times\frac{\partial\mathbf{n}}{\partial t_2}\right) dt_1\,dt_2,
\tag{3.82}
$$

but we must take care to recognize that this new $\mathbf{n}(t_1,t_2)$ is discontinuous
across the line $t = t_1 = t_2$. It is equal to $\mathbf{t}(t)$ for $t_1$ infinitesimally larger
than $t_2$, and equal to $-\mathbf{t}(t)$ when $t_1$ is infinitesimally smaller than $t_2$. By
cutting the square domain of integration and reassembling it into a rhomboid,
as shown in figure 3.11, we obtain a continuous integrand and see that
the Writhe is (minus) the 2-sphere area (counted with multiplicities and
divided by $4\pi$) of a region whose boundary is composed of two curves: $\Gamma$, the
tangent indicatrix, or tantrix, on which $\mathbf{n} = \mathbf{t}(t)$, and its oppositely oriented
antipodal counterpart $\Gamma'$ on which $\mathbf{n} = -\mathbf{t}(t)$.

The 2-sphere area $\Omega(\Gamma)$ bounded by $\Gamma$ is only determined by $\Gamma$ up to the
addition of integer multiples of $4\pi$. Taking note that the "wrong" orientation
of the boundary $\Gamma$ (see figure 3.11 again) compensates for the minus sign
before the integral in (3.82), we have

$$
4\pi\,\mathrm{Wr} = 2\,\Omega(\Gamma) + 4\pi n.
\tag{3.83}
$$

Thus,

$$
\mathrm{Wr} = \frac{1}{2\pi}\,\Omega(\Gamma), \quad \mathrm{mod}\ 1.
\tag{3.84}
$$

We can do better than (3.84) once we realize that by allowing crossings we
can continuously deform any closed curve into a perfect circle. Each
self-crossing causes Lk and Wr (but not Tw which, being a local functional, does
not care about crossings) to jump by $\pm 2$. For a perfect circle $\mathrm{Wr} = 0$ whilst
$\Omega = 2\pi$. We therefore have an improved estimate of the additive integer that
is left undetermined by $\Gamma$, and from it we obtain

$$
\mathrm{Wr} = 1 + \frac{1}{2\pi}\,\Omega(\Gamma), \quad \mathrm{mod}\ 2.
\tag{3.85}
$$

This result is due to Brock Fuller.⁴

We can use our ribbon language to describe conformational transitions in
long molecules. The elastic energy of a closed rod (or DNA molecule) can be
approximated by

$$
E = \int\left\{\tfrac{1}{2}\alpha\,(\boldsymbol{\omega}\cdot\mathbf{t})^2 + \tfrac{1}{2}\beta\,\kappa^2\right\} ds.
\tag{3.86}
$$

Here we are parameterizing the curve by its arc-length $s$. The constant $\alpha$ is
the torsional stiffness coefficient, $\beta$ is the flexural stiffness, and

$$
\kappa(s) = \left|\frac{d^2\mathbf{r}(s)}{ds^2}\right| = \left|\frac{d\mathbf{t}(s)}{ds}\right|
\tag{3.87}
$$

is the local curvature. Suppose that our molecule has linking number $n$, i.e.
it was twisted $n$ times before the ends were joined together to make a loop.
When $\beta \gg \alpha$ the molecule will minimize its bending energy by forming a
planar circle with $\mathrm{Wr} \approx 0$ and $\mathrm{Tw} \approx n$. If we increase $\alpha$, or decrease $\beta$, there
will come a point at which the molecule will seek to save torsional energy at
the expense of bending, and will suddenly writhe into a new configuration
with $\mathrm{Wr} \approx n$ and $\mathrm{Tw} \approx 0$. Such twist-to-writhe transformations will be
familiar to anyone who has struggled to coil a garden hose or electric cable.

Figure 3.12: A molecule initially with Lk = 3, Tw = 3, Wr = 0 writhes to a
new configuration with Lk = 3, Tw = 0, Wr = 3.

⁴ F. Brock Fuller, Proc. Natl. Acad. Sci. USA, 75 (1978) 3557-61.



3.5 Exercises and Problems

Exercise 3.7: Old exam problem. A two-form is expressed in Cartesian coordinates as

$$
\omega = \frac{1}{r^3}\left(z\,dxdy + x\,dydz + y\,dzdx\right),
$$

where $r = \sqrt{x^2+y^2+z^2}$.

a) Evaluate $d\omega$ for $r \neq 0$.

b) Evaluate the integral
$$
\int_P \omega
$$
over the infinite plane $P = \{-\infty < x < \infty,\ -\infty < y < \infty,\ z = 1\}$.

c) A sphere is embedded into $\mathbb{R}^3$ by the map $\varphi$, which takes the point
$(\theta,\phi) \in S^2$ to the point $(x,y,z) \in \mathbb{R}^3$, where
$$
\begin{aligned}
x &= R\cos\phi\,\sin\theta, \\
y &= R\sin\phi\,\sin\theta, \\
z &= R\cos\theta.
\end{aligned}
$$
Pull back $\omega$ and find the 2-form $\varphi^*\omega$ on the sphere. (Hint: The form
$\varphi^*\omega$ is both familiar and simple. If you end up with an intractable mess
of trigonometric functions, you have made an algebraic error.)

d) By exploiting the result of part c), or otherwise, evaluate the integral
$$
\int_{S^2(R)} \omega,
$$
where $S^2(R)$ is the surface of a two-sphere of radius $R$ centered at the
origin.

The following four exercises all explore the same geometric facts relating to
Stokes' theorem and the area 2-form of a sphere, but in different physical
settings.

Exercise 3.8: A flywheel of moment of inertia $I$ can rotate without friction
about an axle whose direction is specified by a unit vector $\mathbf{n}$. The flywheel and
axle are initially stationary. The direction $\mathbf{n}$ of the axle is made to describe a
simple closed curve $\gamma = \partial\Omega$ on the unit sphere, and is then left stationary.

Figure 3.13: Flywheel.

Show that once the axle has returned to rest in its initial direction, the flywheel
has also returned to rest, but has rotated through an angle $\theta = \mathrm{Area}(\Omega)$
when compared with its initial orientation. The area of $\Omega$ is to be counted as
positive if the path $\gamma$ surrounds it in a clockwise sense, and negative otherwise.
Observe that the path $\gamma$ bounds two regions with opposite orientations. Taking
into account that we cannot define the rotation angle at intermediate steps,
show that the area of either region can be used to compute $\theta$, the results
being physically indistinguishable. (Hint: Show that the component
$L_Z = I(\dot\psi + \dot\phi\cos\theta)$ of the flywheel's angular momentum along the axle is a
constant of the motion.)

Exercise 3.9: A ball of unit radius rolls without slipping on a table. The ball
moves in such a way that the point in contact with the table describes a closed
path $\gamma = \partial\Omega$ on the ball. (The corresponding path on the table will not
necessarily be closed.) Show that the final orientation of the ball will be such
that it has rotated, when compared with its initial orientation, through an
angle $\phi = \mathrm{Area}(\Omega)$ about a vertical axis through its center. As in the previous
problem, the area is counted positive if $\gamma$ encircles $\Omega$ in an anti-clockwise sense.
(Hint: recall the no-slip rolling condition $\dot\phi + \dot\psi\cos\theta = 0$ from (2.26).)

Exercise 3.10: Let a curve in $\mathbb{R}^3$ be parameterized by its arc length $s$ as $\mathbf{r}(s)$.
Then the unit tangent to the curve is given by

$$
\mathbf{t}(s) = \frac{d\mathbf{r}}{ds}.
$$

The principal normal $\mathbf{n}(s)$ and the binormal $\mathbf{b}(s)$ are defined by the requirement
that $\dot{\mathbf{t}} = \kappa\,\mathbf{n}$ with the curvature $\kappa(s)$ positive, and that $\mathbf{t}$, $\mathbf{n}$ and
$\mathbf{b} = \mathbf{t}\times\mathbf{n}$ form a right-handed orthonormal frame.

Figure 3.14: Serret-Frenet frames.

a) Show that there exists a scalar $\tau(s)$, the torsion of the curve, such that
$\mathbf{t}$, $\mathbf{n}$ and $\mathbf{b}$ obey the Serret-Frenet relations
$$
\begin{aligned}
\dot{\mathbf{t}} &= \kappa\,\mathbf{n}, \\
\dot{\mathbf{n}} &= -\kappa\,\mathbf{t} + \tau\,\mathbf{b}, \\
\dot{\mathbf{b}} &= -\tau\,\mathbf{n}.
\end{aligned}
$$

b) Any pair of mutually orthogonal unit vectors $\mathbf{e}_1(s)$, $\mathbf{e}_2(s)$ perpendicular
to $\mathbf{t}$ and such that $\mathbf{e}_1\times\mathbf{e}_2 = \mathbf{t}$ can serve as an orthonormal frame for
vectors in the normal plane. A basis pair $\mathbf{e}_1$, $\mathbf{e}_2$ with the property
$$
\dot{\mathbf{e}}_1\cdot\mathbf{e}_2 - \dot{\mathbf{e}}_2\cdot\mathbf{e}_1 = 0
$$
is said to be parallel, or Fermi-Walker, transported along the curve. In
other words, a parallel-transported 3-frame $\mathbf{t}$, $\mathbf{e}_1$, $\mathbf{e}_2$ slides along the
curve $\mathbf{r}(s)$ in such a way that the component of its angular velocity in
the $\mathbf{t}$ direction is always zero. Show that the Serret-Frenet frame $\mathbf{e}_1 = \mathbf{n}$,
$\mathbf{e}_2 = \mathbf{b}$ is not parallel transported, but instead rotates at angular velocity
$\dot\theta = \tau$ with respect to a parallel-transported frame.

c) Consider a finite segment of curve such that the initial and final
Serret-Frenet frames are parallel, and so $\mathbf{t}(s)$ defines a closed path $\gamma = \partial\Omega$
on the unit sphere. Fill in the line-by-line justifications for the following
sequence of manipulations:
$$
\cdots = -\mathrm{Area}(\Omega).
$$
(The line marked V is the one that requires most thought. How can we
define "$\mathbf{b}$" and "$\mathbf{n}$" in the interior of $\Omega$?)

d) Conclude that a Fermi-Walker transported frame will have rotated through
an angle $\theta = \mathrm{Area}(\Omega)$, compared to its initial orientation, by the time it
reaches the end of the curve.

The plane of transversely polarized light propagating in a monomode optical
fibre is Fermi-Walker transported, and this rotation can be studied
experimentally.⁵

⁵ A. Tomita, R. Y. Chiao, Phys. Rev. Lett. 57 (1986) 937-940.

Exercise 3.11: Foucault's pendulum (in disguise). A particle of mass $m$ is
constrained by a pair of frictionless plates to move in a plane $\Pi$ that passes
through the origin $O$. The particle is attracted to $O$ by a force $-\kappa\mathbf{r}$, and it
therefore executes simple harmonic motion within $\Pi$. The orientation of the
plane, specified by a normal vector $\mathbf{n}$, can be altered in such a way that $\Pi$
continues to pass through the centre of attraction $O$.

a) Show that the constrained motion is described by the equation
$$
m\ddot{\mathbf{r}} + \kappa\mathbf{r} = \lambda(t)\,\mathbf{n},
$$
and determine $\lambda(t)$ in terms of $m$, $\mathbf{n}$ and $\mathbf{r}$.

b) Seek a solution in the form
$$
\mathbf{r}(t) = \mathbf{A}(t)\cos(\omega t + \phi),
$$
and, by assuming that $\mathbf{n}$ changes direction slowly compared to the
frequency $\omega = \sqrt{\kappa/m}$, show that $\dot{\mathbf{A}} = -\mathbf{n}(\dot{\mathbf{n}}\cdot\mathbf{A})$. Deduce that $|\mathbf{A}|$ remains
constant, and so $\dot{\mathbf{A}} = \boldsymbol{\omega}\times\mathbf{A}$ for some angular velocity vector $\boldsymbol{\omega}$. Show
that $\boldsymbol{\omega}$ is perpendicular to $\mathbf{n}$.

c) Show that the results of part b) imply that the direction of oscillation $\mathbf{A}$
is "parallel transported" in the sense of the previous problem. Conclude
that if $\mathbf{n}$ slowly describes a closed loop $\gamma = \partial\Omega$ on the unit sphere,
then the direction of oscillation $\mathbf{A}$ ends up rotated through an angle
$\theta = \mathrm{Area}(\Omega)$.

The next exercise introduces a clever trick for solving some of the non-linear
partial differential equations of field theory. The class of equations to which
it and its generalizations are applicable is rather restricted, but when they
work they provide a complete multi-soliton solution.

Problem 3.12: In this problem you will find the spin field $\mathbf{n}(x)$ that minimizes
the energy functional

$$
E[\mathbf{n}] = \frac{1}{2}\int\left(|\nabla n^1|^2 + |\nabla n^2|^2 + |\nabla n^3|^2\right) dx^1 dx^2
$$

for a given positive winding number $N$.

a) Use the results of exercise 3.6 to write the winding number $N$, defined
in (3.35), and the energy functional $E[\mathbf{n}]$ as
$$
\begin{aligned}
4\pi N &= \int \frac{4}{(1+\xi^2+\eta^2)^2}\left(\partial_1\xi\,\partial_2\eta - \partial_1\eta\,\partial_2\xi\right) dx^1 dx^2, \\
E[\mathbf{n}] &= \frac{1}{2}\int \frac{4}{(1+\xi^2+\eta^2)^2}\left((\partial_1\xi)^2 + (\partial_2\xi)^2 + (\partial_1\eta)^2 + (\partial_2\eta)^2\right) dx^1 dx^2,
\end{aligned}
$$
where $\xi$ and $\eta$ are stereographic co-ordinates on $S^2$ specifying the
direction of the unit vector $\mathbf{n}$.

b) Deduce the inequality
$$
E - 4\pi N = \frac{1}{2}\int \frac{4}{(1+\xi^2+\eta^2)^2}\left|(\partial_1 + i\partial_2)(\xi + i\eta)\right|^2 dx^1 dx^2 \ge 0.
$$

c) Deduce that for winding number $N > 0$ the minimum energy solutions
have energy $E = 4\pi N$ and are obtained by solving the first-order linear
equation
$$
(\partial_1 + i\partial_2)(\xi + i\eta) = 0.
$$

d) Solve the equation in part c) and show that the minimal energy solutions
with winding number $N > 0$ are given by
$$
\xi + i\eta = \lambda\,\frac{(z-a_1)\cdots(z-a_N)}{(z-b_1)\cdots(z-b_N)},
$$
where $z = x^1 + ix^2$, and $\lambda$, $a_1,\ldots,a_N$, and $b_1,\ldots,b_N$ are arbitrary
complex numbers, except that no $a$ may coincide with any $b$. This is
the solution we displayed at the end of section 3.4.2.

e) Repeat the analysis for $N < 0$. Show that the solutions are given in
terms of rational functions of $\bar z = x^1 - ix^2$.

The idea of combining the energy functional and the topological charge into a
single, manifestly positive, functional is due to Evgueny Bogomol'nyi. The
resulting first-order linear equation is therefore called a Bogomolnyi equation.
If we had tried to find a solution directly in terms of $\mathbf{n}$, we would have ended
up with a horribly non-linear second-order partial differential equation.



Exercise 3.13: Lobachevski space. The hyperbolic plane of Lobachevski
geometry can be realized by embedding the $Z \ge R$ branch of the two-sheeted
hyperboloid $Z^2 - X^2 - Y^2 = R^2$ into a Minkowski space with metric
$ds^2 = -dZ^2 + dX^2 + dY^2$.

We can parametrize the embedded surface by making an "imaginary radius"
version of the stereographic map, in which the point $P$ on the hyperboloid is
labelled by the co-ordinates of the point $Q$ on the $X$-$Y$ plane (see figure 3.15).

Figure 3.15: A slice through the embedding of two-dimensional Lobachevski
space into three-dimensional Minkowski space, showing the stereographic
parameterization of the embedded space by the Poincaré disc $X^2 + Y^2 < R^2$.

i) Show that the embedding induces the metric
$$
g(\ ,\ ) = \frac{4R^4}{(R^2 - X^2 - Y^2)^2}\left(dX\otimes dX + dY\otimes dY\right), \qquad X^2 + Y^2 < R^2,
$$
of the Poincaré disc model (see problem ??.??) on the hyperboloid.

ii) Use the induced metric to show that the area of a disc of hyperbolic
radius $\rho$ is given by
$$
\mathrm{Area} = 4\pi R^2\sinh^2\left(\frac{\rho}{2R}\right) = 2\pi R^2\left(\cosh(\rho/R) - 1\right),
$$
and so is only given by $\pi\rho^2$ when $\rho$ is small compared to the scale $R$ of
the hyperbolic space. It suffices to consider circles with their centres at
the origin. You will first need to show that the hyperbolic distance $\rho$
from the center of the disc to a point at Euclidean distance $r$ is
$$
\rho = R\ln\left(\frac{R+r}{R-r}\right).
$$



Exercise 3.14: Faraday's "flux rule" for computing the electromotive force $\mathcal{E}$
in a circuit containing a thin moving wire is usually derived by the following
manipulations:

$$
\begin{aligned}
\mathcal{E} &= \oint_{\partial\Omega}(\mathbf{E} + \mathbf{v}\times\mathbf{B})\cdot d\mathbf{r} \\
&= \int_\Omega \mathrm{curl}\,\mathbf{E}\cdot d\mathbf{S} - \oint_{\partial\Omega}\mathbf{B}\cdot(\mathbf{v}\times d\mathbf{r}) \\
&= -\int_\Omega \frac{\partial\mathbf{B}}{\partial t}\cdot d\mathbf{S} - \oint_{\partial\Omega}\mathbf{B}\cdot(\mathbf{v}\times d\mathbf{r}) \\
&= -\frac{d}{dt}\int_\Omega \mathbf{B}\cdot d\mathbf{S}.
\end{aligned}
$$

a) Show that if we parameterize the surface $\Omega$ as $x^\mu(u,v,\tau)$, with $u$, $v$
labelling points on $\Omega$ and $\tau$ parametrizing the evolution of $\Omega$, then the
corresponding manipulations in the covariant differential-form version of
Maxwell's equations lead to
$$
\frac{d}{d\tau}\int_\Omega F = \int_\Omega \mathcal{L}_V F = \int_{\partial\Omega} i_V F = -\int_{\partial\Omega} f,
$$
where $V^\mu = \partial x^\mu/\partial\tau$ and $f = -i_V F$.

b) Show that if we take $\tau$ to be the proper time along the world-line of each
element of $\Omega$, then $V$ is the 4-velocity
$$
V^\mu = \frac{1}{\sqrt{1-v^2}}\,(1, \mathbf{v}),
$$
and $f = -i_V F$ becomes the one-form corresponding to the Lorentz-force
4-vector.

It is not clear that the terms in this covariant form of Faraday's law can be
given any physical interpretation outside the low-velocity limit. When parts
of $\partial\Omega$ have different velocities, the relation of the integrals to measurements
made at fixed co-ordinate time requires thought.⁶

⁶ See E. Marx, Journal of the Franklin Institute, 300 (1975) 353-364.

The next pair of exercises explores some physics appearances of the
continuum Hopf linking number (3.64).

Exercise 3.15: The equations governing the motion of an incompressible
inviscid fluid are $\nabla\cdot\mathbf{v} = 0$ and Euler's equation

$$
\frac{D\mathbf{v}}{Dt} \equiv \frac{\partial\mathbf{v}}{\partial t} + (\mathbf{v}\cdot\nabla)\mathbf{v} = -\nabla P.
$$

Recall that the operator $\partial/\partial t + \mathbf{v}\cdot\nabla$, here written as $D/Dt$, is called the
convective derivative.

a) Take the curl of Euler's equation to show that if $\boldsymbol{\omega} = \nabla\times\mathbf{v}$ is the vorticity
then
$$
\frac{D\boldsymbol{\omega}}{Dt} \equiv \frac{\partial\boldsymbol{\omega}}{\partial t} + (\mathbf{v}\cdot\nabla)\boldsymbol{\omega} = (\boldsymbol{\omega}\cdot\nabla)\mathbf{v}.
$$

b) Combine Euler's equation with part a) to show that
$$
\frac{D}{Dt}(\mathbf{v}\cdot\boldsymbol{\omega}) = \nabla\cdot\left\{\boldsymbol{\omega}\left(\tfrac{1}{2}\mathbf{v}^2 - P\right)\right\}.
$$

c) Show that if $\Omega$ is a volume moving with the fluid, then
$$
\frac{d}{dt}\int_\Omega f(\mathbf{r},t)\,dV = \int_\Omega \frac{Df}{Dt}\,dV.
$$

d) Conclude that when $\boldsymbol{\omega}$ is zero at infinity the helicity
$$
I = \int \mathbf{v}\cdot(\nabla\times\mathbf{v})\,dV = \int \mathbf{v}\cdot\boldsymbol{\omega}\,dV
$$
is a constant of the motion.

The helicity measures the Hopf linking number of the vortex lines. The
discovery⁷ of its conservation founded the field of topological fluid dynamics.

⁷ H. K. Moffatt, J. Fluid Mech. 35 (1969) 117.

Exercise 3.16: Let $\mathbf{B} = \nabla\times\mathbf{A}$ and $\mathbf{E} = -\partial\mathbf{A}/\partial t - \nabla\phi$ be the magnetic and
electric fields in an incompressible and perfectly conducting fluid. In such a
fluid the co-moving electromotive force $\mathbf{E} + \mathbf{v}\times\mathbf{B}$ must vanish everywhere.

a) Use Maxwell's equations to show that
$$
\begin{aligned}
\frac{\partial\mathbf{A}}{\partial t} &= \mathbf{v}\times(\nabla\times\mathbf{A}) - \nabla\phi, \\
\frac{\partial\mathbf{B}}{\partial t} &= \nabla\times(\mathbf{v}\times\mathbf{B}).
\end{aligned}
$$

b) From part a) show that the convective derivative of $\mathbf{A}\cdot\mathbf{B}$ is given by
$$
\frac{D}{Dt}(\mathbf{A}\cdot\mathbf{B}) = \nabla\cdot\left\{\mathbf{B}\,(\mathbf{A}\cdot\mathbf{v} - \phi)\right\}.
$$

c) By using the same reasoning as the previous problem, and assuming that
$\mathbf{B}$ is zero at infinity, conclude that Woltjer's invariant
$$
I = \int(\mathbf{A}\cdot\mathbf{B})\,dV = \int\epsilon_{ijk}A_i\partial_j A_k\,d^3x = \int AF
$$
is a constant of the motion.

This result shows that the Hopf linking number of the magnetic field lines is
independent of time. It is an essential ingredient in the geodynamo theory of
the Earth's magnetic field.

Chapter 4

An Introduction to Topology



Topology is the study of the consequences of continuity. We all know that
a continuous real function defined on a connected interval and positive at
one point and negative at another must take the value zero at some point
between. This fact seems obvious, although a course of real analysis will
convince you of the need for a proof. A less obvious fact, but one that
follows from the previous one, is that a continuous function defined on the
unit circle must possess two diametrically opposite points at which it takes the
same value. To see that this is so, consider $f(\theta+\pi) - f(\theta)$. This difference
(if not initially zero, in which case there is nothing further to prove) changes
sign as $\theta$ is advanced through $\pi$, because the two terms exchange roles. It was
therefore zero somewhere. This observation has practical application in daily
life: Our local coffee shop contains four-legged tables that wobble because
the floor is not level. They are round tables, however, and because they
possess no misguided levelling screws all four legs have the same length. We
are therefore guaranteed that by rotating the table about its center through
an angle of less than $\pi/2$ we will find a stable location. A ninety-degree
rotation interchanges the pair of legs that are both on the ground with the
pair that are rocking, and at the change-over point all four legs must be
simultaneously on the ground.
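
The sign-change argument is constructive, and can be turned into a short bisection search. The following is a sketch under the assumption that `f` is a continuous, $2\pi$-periodic Python function; the name `antipodal_equal_point` is purely illustrative.

```python
import math

def antipodal_equal_point(f, tol=1e-12):
    """Find theta in [0, pi] with f(theta + pi) = f(theta), for continuous 2*pi-periodic f.

    g(theta) = f(theta + pi) - f(theta) satisfies g(pi) = -g(0), so it changes
    sign on [0, pi] and bisection must close in on a zero.
    """
    def g(theta):
        return f(theta + math.pi) - f(theta)

    lo, hi = 0.0, math.pi
    g_lo = g(lo)
    if g_lo == 0.0:
        return lo          # nothing further to prove
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        g_mid = g(mid)
        if g_mid == 0.0:
            return mid
        if (g_mid > 0) == (g_lo > 0):
            lo = mid       # same sign as at the left end: the zero lies to the right
        else:
            hi = mid
    return 0.5 * (lo + hi)

# An arbitrary periodic test function, chosen only for illustration.
height = lambda th: math.sin(th) + 0.3 * math.cos(2 * th) + 0.1 * math.sin(3 * th + 1.0)
theta = antipodal_equal_point(height)
print(theta, height(theta), height(theta + math.pi))
```

The two printed function values should agree to within the tolerance, whatever periodic function is supplied.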

Similar effects with a practical significance for physics appear when we
try to extend our vector and tensor calculus from a local region to an entire
manifold. A smooth field of vectors tangent to the sphere $S^2$ will always
possess a zero, i.e. a point at which the vector field vanishes. On
the torus $T^2$, however, we can construct a nowhere-zero vector field. This
shows that the global topology of the manifold influences the way in which
the tangent spaces are glued together to form the tangent bundle. To study
this influence in a systematic manner we need first to understand how to
characterize the global structure of a manifold, and then to see how this
structure affects the mathematical and physical objects that live on it.



4.1 Homeomorphism and Diffeomorphism 

In the previous chapter we met with a number of topological invariants, 
quantities that are unaffected by continuous deformations. Some invariants 
help to distinguish topologically distinct manifolds. An important example is 
the set of Betti numbers of the manifold. If two manifolds have different Betti 
numbers they are certainly distinct. If, however, they have the same Betti 
numbers, we cannot be sure that they are topologically identical. It is a holy 
grail of topology to find a complete set of invariants such that having them 
all coincide would be enough to say that two manifolds were topologically 
the same. 

In the previous paragraph we were deliberately vague in our use of the
terms "distinct" and the "same". Two topological spaces (spaces equipped
with a definition of what is to be considered an open set) are regarded as
being the "same", or homeomorphic, if there is a one-to-one, onto, continuous
map between them whose inverse is also continuous. Manifolds come with the
additional structure of differentiability: we may therefore talk of "smooth"
maps, meaning that their expression in coordinates is infinitely ($C^\infty$)
differentiable. We regard two manifolds as being the "same", or diffeomorphic, if
there is a one-to-one onto $C^\infty$ map between them whose inverse is also $C^\infty$.
The distinction between homeomorphism and diffeomorphism sounds like a
mere technical nicety, but it has consequences for physics. Edward Witten
discovered¹ that there are 992 distinct 11-spheres. These are manifolds that
are all homeomorphic to the 11-sphere, but diffeomorphically inequivalent.
This fact is crucial for the cancellation of global gravitational anomalies in
the $E_8\times E_8$ or $SO(32)$ symmetric superstring theories.

Since we are interested in the consequences of topology for calculus, we
will restrict ourselves to the interpretation "same" = diffeomorphic.

¹ E. Witten, Comm. Math. Phys. 117 (1986), 197.

4.2 Cohomology

Betti numbers arise in answer to what seems like a simple calculus problem:
when can a vector field whose divergence vanishes be written as the curl of
something? We will see that the answer depends on the global structure of
the space the field inhabits.



4.2.1 Retractable Spaces: Converse of Poincaré Lemma

Poincaré's lemma asserts that $d^2 = 0$. In traditional vector calculus language
this reduces to the statements $\mathrm{curl}(\mathrm{grad}\,\phi) = 0$ and $\mathrm{div}(\mathrm{curl}\,\mathbf{w}) = 0$. We
often assume that the converse is true: If $\mathrm{curl}\,\mathbf{v} = 0$, we expect that we can
find a $\phi$ such that $\mathbf{v} = \mathrm{grad}\,\phi$, and, if $\mathrm{div}\,\mathbf{v} = 0$, that we can find a $\mathbf{w}$ such
that $\mathbf{v} = \mathrm{curl}\,\mathbf{w}$. You know a formula for the first case:

$$
\phi(\mathbf{x}) = \int_{\mathbf{x}_0}^{\mathbf{x}} \mathbf{v}\cdot d\mathbf{x},
\tag{4.1}
$$

but probably do not know the corresponding formula for $\mathbf{w}$. Using
differential forms, and provided the space in which these forms live has suitable
topological properties, it is straightforward to find a solution for the general
problem: If $\omega$ is closed, meaning that $d\omega = 0$, find $\chi$ such that $\omega = d\chi$.

The "suitable topological property" referred to in the previous paragraph
is that the space be retractable. Suppose that the closed form $\omega$ is
defined in a domain $\Omega$. We say that $\Omega$ is retractable to the point $O$ if there
exists a smooth map $\varphi_t : \Omega \to \Omega$ which depends continuously on a parameter
$t \in [0,1]$ and for which $\varphi_1(x) = x$ and $\varphi_0(x) = O$. Applying this retraction
map to the form, we will then have $\varphi_1^*\omega = \omega$ and $\varphi_0^*\omega = 0$. Let us set
$\varphi_t(x^\mu) = x^\mu(t)$. Define $\eta(x,t)$ to be the velocity-vector field that corresponds
to the co-ordinate flow:

$$
\frac{dx^\mu}{dt} = \eta^\mu(x,t).
\tag{4.2}
$$

An easy exercise, using the interpretation of the Lie derivative in (2.40),
shows that

$$
\frac{d}{dt}(\varphi_t^*\omega) = \mathcal{L}_\eta(\varphi_t^*\omega).
\tag{4.3}
$$

We now use the infinitesimal homotopy relation and our assumption that
$d\omega = 0$, and hence (from exercise 3.3) that $d(\varphi_t^*\omega) = 0$, to write

$$
\mathcal{L}_\eta(\varphi_t^*\omega) = (i_\eta d + d\,i_\eta)(\varphi_t^*\omega) = d[i_\eta(\varphi_t^*\omega)].
\tag{4.4}
$$

Thus

$$
\frac{d}{dt}(\varphi_t^*\omega) = d[i_\eta(\varphi_t^*\omega)].
$$

Using this we can integrate up with respect to $t$ to find

$$
\omega = \varphi_1^*\omega - \varphi_0^*\omega = d\left(\int_0^1 i_\eta(\varphi_t^*\omega)\,dt\right).
\tag{4.5}
$$

Thus

$$
\chi = \int_0^1 i_\eta(\varphi_t^*\omega)\,dt
\tag{4.6}
$$

solves our problem.

This magic formula for $\chi$ makes use of nearly all the "calculus on
manifolds" concepts that we have introduced so far. The notation is so
powerful that it has suppressed nearly everything that a traditionally-educated
physicist would find familiar. We will therefore unpack the symbols by means
of a concrete example. Let us take $\Omega$ to be the whole of $\mathbb{R}^3$. This can be
retracted to the origin via the map $\varphi_t(x^\mu) = x^\mu(t) = t x^\mu$. The velocity field
whose flow gives

$$
x^\mu(t) = t\,x^\mu(0)
$$

is $\eta^\mu(x,t) = x^\mu/t$. To verify this, compute

$$
\frac{dx^\mu}{dt} = x^\mu(0) = \frac{1}{t}\,x^\mu(t),
$$

so $x^\mu(t)$ is indeed the solution to

$$
\frac{dx^\mu}{dt} = \eta^\mu(x(t),t).
$$

Now let us apply this retraction to $\omega = A\,dydz + B\,dzdx + C\,dxdy$ with

$$
d\omega = \left(\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} + \frac{\partial C}{\partial z}\right)dxdydz = 0.
\tag{4.7}
$$

The pull-back $\varphi_t^*$ gives

$$
\varphi_t^*\omega = A(tx,ty,tz)\,d(ty)\,d(tz) + (\text{two similar terms}).
\tag{4.8}
$$

The interior product with

$$
\eta = \frac{1}{t}\left(x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} + z\frac{\partial}{\partial z}\right)
\tag{4.9}
$$

then gives

$$
i_\eta\varphi_t^*\omega = tA(tx,ty,tz)(y\,dz - z\,dy) + (\text{two similar terms}).
\tag{4.10}
$$

Finally we form the ordinary integral over $t$ to get

$$
\begin{aligned}
\chi = \int_0^1 i_\eta(\varphi_t^*\omega)\,dt
&= \left[\int_0^1 A(tx,ty,tz)\,t\,dt\right](y\,dz - z\,dy) \\
&\quad + \left[\int_0^1 B(tx,ty,tz)\,t\,dt\right](z\,dx - x\,dz) \\
&\quad + \left[\int_0^1 C(tx,ty,tz)\,t\,dt\right](x\,dy - y\,dx).
\end{aligned}
\tag{4.11}
$$

In this expression the integrals in the square brackets are just numerical
coefficients, i.e., the "$dt$" is not part of the one-form. It is instructive,
because not entirely trivial, to let "$d$" act on $\chi$ and verify that the
construction works. If we focus first on the term involving $A$, we find that
$d\left[\int_0^1 A(tx,ty,tz)\,t\,dt\right](y\,dz - z\,dy)$ can be grouped as

$$
\left[\int_0^1\left\{2tA + t^2\left(x\frac{\partial A}{\partial x} + y\frac{\partial A}{\partial y} + z\frac{\partial A}{\partial z}\right)\right\}dt\right]dydz
- \int_0^1 t^2\frac{\partial A}{\partial x}\,dt\,(x\,dydz + y\,dzdx + z\,dxdy).
\tag{4.12}
$$

The first of these terms is equal to

$$
\left[\int_0^1 \frac{d}{dt}\left\{t^2 A(tx,ty,tz)\right\}dt\right]dydz = A(x,y,z)\,dydz,
\tag{4.13}
$$

which is part of $\omega$. The second term will combine with the terms involving
$B$, $C$, to become

$$
-\int_0^1 t^2\left(\frac{\partial A}{\partial x} + \frac{\partial B}{\partial y} + \frac{\partial C}{\partial z}\right)dt\,(x\,dydz + y\,dzdx + z\,dxdy),
\tag{4.14}
$$

which is zero by our hypothesis. Putting together the $A$, $B$, $C$ terms does
therefore reconstitute $\omega$.
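
Read in vector language, with $\mathbf{v} = (A, B, C)$ and $\mathbf{r} = (x, y, z)$, the one-form (4.11) has the components of $\mathbf{w}(\mathbf{r}) = \left(\int_0^1 t\,\mathbf{v}(t\mathbf{r})\,dt\right)\times\mathbf{r}$, which supplies the formula for $\mathbf{w}$ that was left open after (4.1). The sketch below is a numerical sanity check only, not part of the original notes. It assumes a particular divergence-free test field, approximates the $t$ integral by the midpoint rule, and takes the curl by central differences. The function names are hypothetical.

```python
def vector_potential(v, r, n=2000):
    """w(r) = ( integral_0^1 t v(t r) dt ) x r, with the integral done by the midpoint rule."""
    x, y, z = r
    a = b = c = 0.0
    for k in range(n):
        t = (k + 0.5) / n
        A, B, C = v((t * x, t * y, t * z))
        a += t * A / n
        b += t * B / n
        c += t * C / n
    # (a, b, c) x (x, y, z): the coefficients of dx, dy, dz in (4.11)
    return (b * z - c * y, c * x - a * z, a * y - b * x)

def curl(w, r, h=1e-4):
    """Central-difference curl of the vector field w at the point r."""
    def d(i, j):
        # partial derivative of component i with respect to coordinate j
        rp, rm = list(r), list(r)
        rp[j] += h
        rm[j] -= h
        return (w(tuple(rp))[i] - w(tuple(rm))[i]) / (2 * h)
    return (d(2, 1) - d(1, 2), d(0, 2) - d(2, 0), d(1, 0) - d(0, 1))

def v_test(p):
    """An assumed test field with zero divergence: d/dx(xy + z) + d/dy(-y^2/2) + d/dz(x^2) = 0."""
    x, y, z = p
    return (x * y + z, -0.5 * y * y, x * x)

point = (0.3, -0.7, 1.1)
print(curl(lambda p: vector_potential(v_test, p), point), v_test(point))
```

The two printed triples should agree to several decimal places. If the test field is replaced by one with non-zero divergence, the agreement is lost, which is the numerical counterpart of the term (4.14) failing to vanish.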
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4.2.2 Obstructions to Exactness 

The condition that $\Omega$ be retractable plays an essential role in the converse to
Poincaré's lemma. In its absence $d\omega = 0$ does not guarantee that there is a
$\chi$ such that $\omega = d\chi$. Consider, for example, a vector field $\mathbf{v}$ with $\mathrm{curl}\,\mathbf{v} = 0$
in an annulus $\Omega = \{R_0 < |\mathbf{r}| < R_1\}$. In the annulus (a non-retractable space)
the condition that $\mathrm{curl}\,\mathbf{v} = 0$ does not prohibit $\oint_\Gamma \mathbf{v}\cdot d\mathbf{r}$ being non-zero for
some closed path $\Gamma$ encircling the central hole. When this line integral is
non-zero then there can be no single-valued $\chi$ such that $\mathbf{v} = \nabla\chi$. If there
were such a $\chi$, then

$$
\oint_\Gamma \mathbf{v}\cdot d\mathbf{r} = \chi(0) - \chi(0) = 0.
\tag{4.15}
$$

A non-zero value for $\oint_\Gamma \mathbf{v}\cdot d\mathbf{r}$ therefore constitutes an obstruction to the
existence of a $\chi$ such that $\mathbf{v} = \nabla\chi$.
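
A concrete instance of such an obstruction is the planar vortex field $\mathbf{v} = (-y, x)/(x^2+y^2)$. This field is an illustrative choice made here, not one taken from the text. It is curl-free everywhere in an annulus, yet its circulation around the hole is $2\pi$. A minimal numerical sketch, with hypothetical helper names:

```python
import math

def vortex(x, y):
    """v = (-y, x) / (x^2 + y^2): curl-free away from the origin."""
    r2 = x * x + y * y
    return (-y / r2, x / r2)

def circulation(field, radius=1.0, n=1000):
    """Midpoint-rule estimate of the line integral of field around a circle about the origin."""
    total = 0.0
    for k in range(n):
        th = 2 * math.pi * (k + 0.5) / n
        x, y = radius * math.cos(th), radius * math.sin(th)
        dx, dy = -radius * math.sin(th), radius * math.cos(th)   # dr/dtheta
        vx, vy = field(x, y)
        total += (vx * dx + vy * dy) * 2 * math.pi / n
    return total

def curl_z(field, x, y, h=1e-5):
    """Central-difference estimate of d(v_y)/dx - d(v_x)/dy."""
    dvy_dx = (field(x + h, y)[1] - field(x - h, y)[1]) / (2 * h)
    dvx_dy = (field(x, y + h)[0] - field(x, y - h)[0]) / (2 * h)
    return dvy_dx - dvx_dy

print(circulation(vortex), curl_z(vortex, 0.8, 0.3))
```

The circulation is independent of the radius chosen, as it must be for a closed form, while the curl estimate is zero to within the finite-difference error.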

Example: The sphere $S^2$ is not retractable. The area 2-form $\sin\theta\,d\theta\,d\phi$ is
closed, but, although we can write

$$
\sin\theta\,d\theta\,d\phi = d[(1-\cos\theta)\,d\phi],
\tag{4.16}
$$

the 1-form $(1-\cos\theta)\,d\phi$ is singular at the south pole, $\theta = \pi$. We could try

$$
\sin\theta\,d\theta\,d\phi = d[(-1-\cos\theta)\,d\phi],
\tag{4.17}
$$

but this is singular at the north pole, $\theta = 0$. There is no escape: we know
that

$$
\int_{S^2}\sin\theta\,d\theta\,d\phi = 4\pi,
\tag{4.18}
$$

but if $\sin\theta\,d\theta\,d\phi = d\chi$ then Stokes says that

$$
\int_{S^2}\sin\theta\,d\theta\,d\phi = \int_{\partial S^2}\chi = 0
\tag{4.19}
$$

because $\partial S^2 = \emptyset$. Again, a non-zero value for $\int\omega$ over some boundary-less
region has provided an obstruction to finding a $\chi$ such that $\omega = d\chi$.
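
It may help to see how the two local potentials combine to give the total area. The following short calculation is a sketch added for illustration. The labels $\chi_N$, $\chi_S$ for the 1-forms in (4.16) and (4.17), and $H_N$, $H_S$ for the northern and southern hemispheres, are introduced here and do not appear in the text.

```latex
\begin{aligned}
\int_{S^2}\sin\theta\,d\theta\,d\phi
 &= \int_{H_N} d\chi_N + \int_{H_S} d\chi_S ,
 \qquad \chi_N = (1-\cos\theta)\,d\phi , \quad \chi_S = (-1-\cos\theta)\,d\phi , \\
 &= \oint_{\theta=\pi/2} \chi_N \;-\; \oint_{\theta=\pi/2} \chi_S
 \qquad (\partial H_N = -\,\partial H_S = \text{the equator, traversed with increasing } \phi), \\
 &= \oint_{\theta=\pi/2} 2\,d\phi \;=\; 4\pi .
\end{aligned}
```

Each potential is used only on the hemisphere where it is non-singular. The whole area then comes from the mismatch $\chi_N - \chi_S = 2\,d\phi$ on the equator, a 1-form that is closed but is not the differential of any single-valued function there.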

4.2.3 De Rham Cohomology

We have seen that sometimes the condition $d\omega = 0$ allows us to find a $\chi$ such
that $\omega = d\chi$, and sometimes it does not. If the region in which we seek $\chi$ is
retractable, we can always construct it. If the region is not retractable there
may be an obstruction to the existence of $\chi$. In order to describe the various
possibilities we introduce the language of cohomology, or more precisely de
Rham cohomology, named for the Swiss mathematician Georges de Rham
who did the most to create it.

The significance of cohomology for physics is that many important
quantities can be expressed as integrals of differential forms that lie in some
cohomology space.

For simplicity suppose that we are working in a compact manifold $M$
without boundary. Let $\Omega^p(M) = \bigwedge^p(T^*M)$ be the space of all smooth $p$-form
fields. It is a vector space over $\mathbb{R}$: we can add $p$-form fields and multiply them
by real constants, but, as is the vector space $C^\infty(M)$ of smooth functions on
$M$, it is infinite dimensional. The subspace $Z^p(M)$ of closed forms (those
with $d\omega = 0$) is also an infinite dimensional vector space, and the same
is true of the space $B^p(M)$ of exact forms (those that can be written as
$\omega = d\chi$ for some globally defined $(p-1)$-form $\chi$). Now consider the space
$H^p = Z^p/B^p$, which is the space of closed forms modulo exact forms. In this
space we do not distinguish between two forms, $\omega_1$ and $\omega_2$, when there is a $\chi$
such that $\omega_1 = \omega_2 + d\chi$. We say that $\omega_1$ and $\omega_2$ are cohomologous, and write
$\omega_1 \sim \omega_2 \in H^p(M)$. We will use the symbol $[\omega]$ to denote the equivalence
class of forms cohomologous to $\omega$. Now a miracle happens! For a compact
manifold $M$ the space $H^p(M)$ is finite dimensional! It is called the $p$-th (de
Rham) cohomology space of the manifold, and depends only on the global
topology of $M$. In particular, it does not depend on any metric we may have
chosen for $M$.

Sometimes we write $H^p_{\rm dR}(M,\mathbb{R})$ to make clear that we are dealing with
de Rham cohomology, and that we are working with vector spaces over the
real numbers. This is because there is also a space $H^p_{\rm dR}(M,\mathbb{Z})$, where we
only allow multiplication by integers.

The cohomology space $H^p_{\rm dR}(M,\mathbb{R})$ codifies all potential obstructions to
solving the problem of finding a $(p-1)$-form $\chi$ such that $d\chi = \omega$: we can
find such a $\chi$ if and only if $\omega$ is cohomologous to zero in $H^p_{\rm dR}(M,\mathbb{R})$. If
$H^p_{\rm dR}(M,\mathbb{R}) = \{0\}$, which is the case if $M$ is retractable, then all closed
$p$-forms are cohomologous to zero. If $H^p_{\rm dR}(M,\mathbb{R}) \neq \{0\}$, then some closed
$p$-forms $\omega$ will not be cohomologous to zero. We can test whether
$\omega \sim 0 \in H^p_{\rm dR}(M,\mathbb{R})$ by forming suitable integrals.

4.3 Homology

The language of cohomology seems rather abstract. To understand its origin
it may be more intuitive to think about the spaces that are the cohomology
spaces' vector-space duals. These homology spaces are simple to understand
pictorially.

The basic idea is that, given a region $\Omega$, we can find its boundary $\partial\Omega$.
Inspection of a few simple cases will soon lead to the conclusion that the
"boundary of a boundary" consists of nothing. In symbols, $\partial^2 = 0$. The
statement "$\partial^2 = 0$" is clearly analogous to "$d^2 = 0$," and, pursuing the
analogy, we can construct a vector space of "regions" and define two "regions"
as being homologous if they differ by the boundary of another "region."

4.3.1 Chains, Cycles and Boundaries 

We begin by making precise the vague notions of region and boundary. 

Simplicial Complexes 

The set of all curves and surfaces in a manifold M is infinite dimensional, but 
the homology spaces are finite dimensional. Life would be much easier if we 
could use finite dimensional spaces throughout. Mathematicians therefore 
do what any computationally-minded physicist would do: they approximate 
the smooth manifold by a discrete polygonal grid. Were they interested in
distances, they would necessarily use many small polygons so as to obtain 
a good approximation to the detailed shape of the manifold. The global 
topology, though, can often be captured by a rather coarse discretization. 
The result of this process is to reduce a complicated problem in differential 
geometry to one of simple algebra. The resulting theory is therefore known 
as algebraic topology. 

It turns out to be convenient to approximate the manifold by generalized
triangles. We therefore dissect $M$ into line segments (if one dimensional),
triangles (if two dimensional), tetrahedra (if three dimensional) or higher
dimensional $p$-simplices (singular: simplex). The rules for the dissection are:

a) Every point must belong to at least one simplex. 

b) A point can belong to only a finite number of simplices. 

c) Two different simplices either have no points in common, or

i) one is a face (or edge, or vertex) of the other,

ii) the set of points in common is the whole of a shared face (or edge,
or vertex).

Figure 4.1: Triangles, or 2-simplices, that are a) allowed, b) not allowed in a
dissection. In b) only parts of edges are in common.

Figure 4.2: A triangulation of the 2-torus. a) The torus as a rectangle
with periodic boundary conditions: The two edges labeled $\alpha$ will be glued
together point-by-point along the arrows when we reassemble the torus, and
so are to be regarded as a single edge. The two sides labeled $\beta$ will be glued
similarly. b) The assembled torus: All four $P$'s are now in the same place,
and correspond to a single point.

The collection of simplices composing the dissected space is called a simplicial 
complex. We will denote it by S. 

We may not need many triangles to capture the global topology. For
example, figure 4.2 shows how a two-dimensional torus can be decomposed
into two 2-simplices (triangles) bounded by three 1-simplices (edges) $\alpha$, $\beta$, $\gamma$,
and with only a single 0-simplex (vertex) $P$. Computations are easier to
describe, however, if each simplex in the decomposition is uniquely specified
by its vertices. For this we usually need a slightly finer dissection. Figure
4.3 shows a decomposition of the torus into 18 triangles each of which is
uniquely labeled by three points drawn from a set of nine vertices. In this
figure vertices with identical labels are to be regarded as the same vertex,
as are the corresponding sides of triangles. Thus, each of the edges $P_1P_2$,
$P_2P_3$, $P_3P_1$ at the top of the figure is to be glued point-by-point to the
corresponding edge at the bottom of the figure. Similarly along the sides. The
resulting simplicial complex then has 27 edges.

Figure 4.4: A tetrahedral triangulation of the 2-sphere. The circulating
arrows on the faces indicate the choice of orientation $P_1P_2P_4$ and $P_2P_3P_4$.

We may triangulate the sphere $S^2$ as a tetrahedron with vertices $P_1$, $P_2$,
$P_3$, $P_4$. This dissection has six edges: $P_1P_2$, $P_1P_3$, $P_1P_4$, $P_2P_3$, $P_2P_4$, $P_3P_4$,
and four faces: $P_2P_3P_4$, $P_1P_3P_4$, $P_1P_2P_4$ and $P_1P_2P_3$.
p-Chains

We assign to simplices an orientation defined by the order in which we write
their defining vertices. The interchange of any pair of vertices reverses the
orientation, and we consider there to be a relative minus sign between
oppositely oriented but otherwise identical simplices: $P_2P_1P_3P_4 = -P_1P_2P_3P_4$.

We now construct abstract vector spaces $C_p(S,\mathbb{R})$ of $p$-chains which have
the oriented $p$-simplices as their basis vectors. The most general elements of
$C_2(S,\mathbb{R})$, with $S$ being the tetrahedral triangulation of the sphere $S^2$, would
be

$$a_1 P_2P_3P_4 + a_2 P_1P_3P_4 + a_3 P_1P_2P_4 + a_4 P_1P_2P_3, \tag{4.20}$$

where $a_1,\ldots,a_4$ are real numbers. We regard the distinct faces as being
linearly independent basis elements for $C_2(S,\mathbb{R})$. The space is therefore four
dimensional. If we had triangulated the sphere so that it had 16 triangular
faces, the space $C_2$ would be 16 dimensional.

Similarly, the general element of $C_1(S,\mathbb{R})$ would be

$$b_1 P_1P_2 + b_2 P_1P_3 + b_3 P_1P_4 + b_4 P_2P_3 + b_5 P_2P_4 + b_6 P_3P_4, \tag{4.21}$$

and so $C_1(S,\mathbb{R})$ is a six-dimensional space spanned by the edges of the
tetrahedron. For $C_0(S,\mathbb{R})$ we have

$$c_1 P_1 + c_2 P_2 + c_3 P_3 + c_4 P_4, \tag{4.22}$$

and so $C_0(S,\mathbb{R})$ is four dimensional, and spanned by the vertices.

Our manifold comprises only the surface of the two-sphere, so there is no
such thing as $C_3(S,\mathbb{R})$.

The reason for making the field $\mathbb{R}$ explicit in these definitions is that we
sometimes gain more information about the topology if we allow only integer
coefficients. The space of such $p$-chains is then denoted by $C_p(S,\mathbb{Z})$.
Because a vector space requires that coefficients be drawn from a field, these
objects are no longer vector spaces. They can be thought of either as modules
("vector spaces" whose coefficients are drawn from a ring) or as additive
abelian groups.
Figure 4.5: The oriented triangle $P_2P_3P_4$ has boundary $P_3P_4 + P_4P_2 + P_2P_3$.



The Boundary Operator 

We now introduce a linear map $\partial_p : C_p \to C_{p-1}$, called the boundary operator.
Its action on a $p$-simplex is

$$\partial_p\, P_{i_1}P_{i_2}\cdots P_{i_{p+1}} = \sum_{j=1}^{p+1} (-1)^{j+1}\, P_{i_1}\cdots\widehat{P}_{i_j}\cdots P_{i_{p+1}}, \tag{4.23}$$

where the "hat" indicates that $P_{i_j}$ is to be omitted. The resulting
$(p-1)$-chain is called the boundary of the simplex. For example

$$\partial_2(P_2P_3P_4) = P_3P_4 - P_2P_4 + P_2P_3 = P_3P_4 + P_4P_2 + P_2P_3. \tag{4.24}$$

The boundary of a line segment is the difference of its endpoints

$$\partial_1(P_1P_2) = P_2 - P_1. \tag{4.25}$$

Finally, for any point,

$$\partial P_i = 0. \tag{4.26}$$

Because $\partial$ is defined to be a linear map, when it is applied to a $p$-chain
$c = a_1 s_1 + a_2 s_2 + \cdots + a_n s_n$, where the $s_i$ are $p$-simplices, we have
$\partial_p c = a_1 \partial_p s_1 + a_2 \partial_p s_2 + \cdots + a_n \partial_p s_n$.

When we take the "$\partial$" of a chain of compatibly oriented simplices that
together make up some region, the internal boundaries cancel in pairs, and
the "boundary" of the chain really is the oriented geometric boundary of the
region. For example in figure 4.6 we find that

$$\partial(P_1P_5P_2 + P_2P_5P_4 + P_3P_4P_5 + P_1P_3P_5) = P_1P_3 + P_3P_4 + P_4P_2 + P_2P_1, \tag{4.27}$$

which is the counter-clockwise directed boundary of the square.

Figure 4.6: Compatibly oriented simplices.

For each of the examples we find that $\partial_{p-1}\partial_p s = 0$. From the definition
(4.23) we can easily establish that this identity holds for any $p$-simplex $s$. As
chains are sums of simplices and $\partial_p$ is linear, it remains true for any $c \in C_p$.
Thus $\partial_{p-1}\partial_p = 0$. We will usually abbreviate this statement as $\partial^2 = 0$.
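The definition (4.23) is easy to experiment with. The following is a minimal sketch, not part of the original notes: it assumes that a chain is stored as a dictionary from vertex tuples to coefficients, and the names `boundary` and `add_oriented` are illustrative choices. It reproduces (4.24) and (4.27) and checks that the boundary of a boundary vanishes for the solid tetrahedron.

```python
from collections import defaultdict


def add_oriented(chain, simplex, coeff):
    """Add coeff * simplex to chain, storing simplices with sorted vertices.

    Reordering the vertices costs the sign of the sorting permutation,
    which is how P2P1 = -P1P2 is implemented.
    """
    perm = sorted(range(len(simplex)), key=lambda k: simplex[k])
    inversions = sum(
        1
        for a in range(len(perm))
        for b in range(a + 1, len(perm))
        if perm[a] > perm[b]
    )
    key = tuple(simplex[k] for k in perm)
    chain[key] += coeff if inversions % 2 == 0 else -coeff


def boundary(chain):
    """Apply the boundary operator of (4.23) to a chain, using linearity."""
    result = defaultdict(int)
    for simplex, coeff in chain.items():
        if len(simplex) == 1:
            continue  # the boundary of a point is zero, as in (4.26)
        for j in range(len(simplex)):
            face = simplex[:j] + simplex[j + 1:]  # omit one vertex
            sign = -1 if j % 2 else 1             # (-1)^(j+1) with j counted from 1
            add_oriented(result, face, sign * coeff)
    return {s: c for s, c in result.items() if c != 0}


# Equation (4.24): the boundary of the triangle P2 P3 P4.
print(boundary({(2, 3, 4): 1}))  # {(3, 4): 1, (2, 4): -1, (2, 3): 1}

# Equation (4.27): the four triangles of figure 4.6.
square = {(1, 5, 2): 1, (2, 5, 4): 1, (3, 4, 5): 1, (1, 3, 5): 1}
print(boundary(square))

# The boundary of a boundary vanishes.
print(boundary(boundary({(1, 2, 3, 4): 1})))  # {}
```

In this representation the edge $P_4P_2$ appears as the key `(2, 4)` with coefficient $-1$, so the output for the square should be read as $P_1P_3 + P_3P_4 + P_4P_2 + P_2P_1$.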



Cycles, Boundaries and Homology 

A chain complex is a doubly infinite sequence of spaces (these can be vector
spaces, modules, abelian groups, or many other mathematical objects) such
as $\ldots, C_{-2}, C_{-1}, C_0, C_1, C_2, \ldots$, together with structure-preserving maps

$$\cdots \xrightarrow{\partial_{p+1}} C_p \xrightarrow{\partial_p} C_{p-1} \xrightarrow{\partial_{p-1}} C_{p-2} \xrightarrow{\partial_{p-2}} \cdots, \tag{4.28}$$

with the property that $\partial_{p-1}\partial_p = 0$. The finite sequence of $C_p$'s we constructed
from our simplicial complex is an example of a chain complex where $C_p$ is
zero-dimensional for $p < 0$ or $p > d$, with $d$ the dimension of the manifold.
Chain complexes are a useful tool in mathematics, and the ideas we explain in
this section have many applications.

Given any chain complex we can define two important linear subspaces
of each of the $C_p$'s. The first is the space $Z_p$ of $p$-cycles. This consists of
those $z \in C_p$ such that $\partial_p z = 0$. The second is the space $B_p$ of $p$-boundaries,
and consists of those $b \in C_p$ such that $b = \partial_{p+1} c$ for some $c \in C_{p+1}$. Because
$\partial^2 = 0$, the boundaries $B_p$ constitute a subspace of $Z_p$. From these spaces
we form the quotient space $H_p = Z_p/B_p$, consisting of equivalence classes of
$p$-cycles, where we deem $z_1$ and $z_2$ to be equivalent, or homologous, if they
differ by a boundary: $z_2 = z_1 + \partial c$. We will write the equivalence class of
cycles homologous to $z_i$ as $[z_i]$. The space $H_p$, or more accurately, $H_p(\mathbb{R})$, is
called the $p$-th (simplicial) homology space of the chain complex. It becomes
the $p$-th homology group if $\mathbb{R}$ is replaced by the integers.
We can construct these homology spaces for any chain complex. When
the chain complex is derived from a simplicial complex decomposition of a
manifold $M$ a remarkable thing happens. The spaces $C_p$, $Z_p$, and $B_p$ all
depend on the details of how the manifold $M$ has been dissected to form
the simplicial complex $S$. The homology space $H_p$, however, is independent
of the dissection. This is neither obvious nor easy to prove. We will rely on
examples to make it plausible. Granted this independence, we will write
$H_p(M)$, or $H_p(M,\mathbb{R})$, so as to make it clear that $H_p$ is a property of $M$. The
dimension $b_p$ of $H_p(M)$ is called the $p$-th Betti number of the manifold:

$$b_p = \dim H_p(M). \tag{4.29}$$

Example: The Two-Sphere. For the tetrahedral dissection of the two-sphere,
any vertex $P_i$ is homologous to any other, as $P_i - P_j = \partial(P_jP_i)$ and all
$P_jP_i$ belong to $C_1$. Furthermore, $\partial P_i = 0$, so $H_0(S^2)$ is one dimensional.
In general, the dimension of $H_0(M)$ is the number of disconnected pieces
making up $M$. We will write $H_0(S^2) = \mathbb{R}$, regarding $\mathbb{R}$ as the archetype of a
one-dimensional vector space.

Now let us consider $H_1(S^2)$. We first find the space of 1-cycles $Z_1$. An
element of $C_1$ will be in $Z_1$ only if each vertex that is the beginning of an edge
is also the end of an edge, and these edges have the same coefficient.
Thus

$$z_1 = P_2P_3 + P_3P_4 + P_4P_2$$

is a cycle, as is

$$z_2 = P_1P_4 + P_4P_2 + P_2P_1.$$

These are both boundaries of faces of the tetrahedron. It should be fairly
easy to convince yourself that $Z_1$ is the space of linear combinations of these
together with boundaries of the other faces

$$z_3 = P_1P_4 + P_4P_3 + P_3P_1,$$
$$z_4 = P_1P_3 + P_3P_2 + P_2P_1.$$

Any three of these are linearly independent, and so $Z_1$ is three dimensional.
Because all of the cycles are boundaries, every element of $Z_1$ is homologous
to 0, and so $H_1(S^2) = \{0\}$.

We also see that $H_2(S^2) = \mathbb{R}$. Here the basis element is

$$P_2P_3P_4 - P_1P_3P_4 + P_1P_2P_4 - P_1P_2P_3, \tag{4.30}$$

which is the 2-chain corresponding to the entire surface of the sphere. It
would be the boundary of the solid tetrahedron, but does not count as a
boundary as the interior of the tetrahedron is not part of the simplicial
complex.
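The dimension counting in this example can be checked mechanically, since $b_p = \dim C_p - {\rm rank}\,\partial_p - {\rm rank}\,\partial_{p+1}$. The sketch below is an illustration rather than anything from the original notes: the helper names `rank`, `boundary_matrix` and `betti_numbers` are assumptions, simplices are stored as sorted vertex tuples, and ranks are computed by exact Gaussian elimination over the rationals.

```python
from fractions import Fraction
from itertools import combinations


def rank(matrix):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def boundary_matrix(p_simplices, faces):
    """Matrix of the boundary map from p-chains to (p-1)-chains.

    Simplices are sorted vertex tuples, so omitting the j-th vertex
    (counting from 0) gives a sorted face with sign (-1)^j.
    """
    index = {face: row for row, face in enumerate(faces)}
    matrix = [[0] * len(p_simplices) for _ in faces]
    for col, simplex in enumerate(p_simplices):
        for j in range(len(simplex)):
            face = simplex[:j] + simplex[j + 1:]
            matrix[index[face]][col] = (-1) ** j
    return matrix


def betti_numbers(simplices_by_dim):
    """Betti numbers b_p = dim C_p - rank(d_p) - rank(d_{p+1})."""
    top = len(simplices_by_dim) - 1
    ranks = [0] * (top + 2)  # ranks[p] is the rank of d_p; d_0 and d_{top+1} are zero
    for p in range(1, top + 1):
        ranks[p] = rank(boundary_matrix(simplices_by_dim[p], simplices_by_dim[p - 1]))
    return [len(simplices_by_dim[p]) - ranks[p] - ranks[p + 1] for p in range(top + 1)]


vertices = (1, 2, 3, 4)
# Tetrahedral triangulation of S^2: vertices, edges and faces, but no interior.
sphere = [list(combinations(vertices, k + 1)) for k in range(3)]
print(betti_numbers(sphere))  # [1, 0, 1]

# Including the solid tetrahedron turns the 2-cycle (4.30) into a boundary.
ball = sphere + [[vertices]]
print(betti_numbers(ball))  # [1, 0, 0, 0]
```

The second call illustrates the closing remark of the example: once the interior is part of the complex, the surface 2-chain becomes a boundary and the second homology space disappears.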

Example: The Torus. Consider the 2-torus $T^2$. We will see that $H_0(T^2) = \mathbb{R}$,
$H_1(T^2) = \mathbb{R}^2 \equiv \mathbb{R} \oplus \mathbb{R}$, and $H_2(T^2) = \mathbb{R}$. A natural basis for the
two-dimensional $H_1(T^2)$ consists of the 1-cycles $\alpha$, $\beta$ portrayed in figure 4.7.

Figure 4.7: A basis of 1-cycles on the 2-torus.



The cycle $\gamma$ that, in figure 4.2, winds once around the torus is homologous
to $\alpha + \beta$. In terms of the second triangulation of the torus (figure 4.3) we
would have

$$\alpha = P_1P_2 + P_2P_3 + P_3P_1,$$
$$\beta = P_1P_7 + P_7P_4 + P_4P_1, \tag{4.31}$$

and

$$\gamma = P_1P_8 + P_8P_6 + P_6P_1 = \alpha + \beta + \partial(P_1P_8P_2 + P_8P_9P_2 + P_2P_9P_3 + \cdots). \tag{4.32}$$

Example: The Projective Plane. The projective plane $\mathbb{R}P^2$ can be regarded
as a rectangle with diametrically opposite points identified. Suppose we
decompose $\mathbb{R}P^2$ into eight triangles, as in figure 4.8.

Figure 4.8: A triangulation of the projective plane.



Consider the "entire surface"

$$\sigma = P_1P_2P_5 + P_1P_5P_4 + \cdots \in C_2(\mathbb{R}P^2), \tag{4.33}$$

consisting of the sum of all eight 2-simplices with the orientation indicated
in the figure. Let $\alpha = P_1P_2 + P_2P_3$ and $\beta = P_1P_4 + P_4P_3$ be the sides of the
rectangle running along the bottom horizontal and left vertical sides of the
figure, respectively. In each case they run from $P_1$ to $P_3$. Then

$$\partial(\sigma) = P_1P_2 + P_2P_3 + P_3P_4 + P_4P_1 + P_1P_2 + P_2P_3 + P_3P_4 + P_4P_1 = 2(\alpha - \beta) \neq 0. \tag{4.34}$$

Although $\mathbb{R}P^2$ has no actual edge that we can fall off, from the homological
viewpoint it does have a boundary! This represents the conflict between local
orientation of each of the 2-simplices and the global non-orientability of $\mathbb{R}P^2$.
The surface $\sigma$ of $\mathbb{R}P^2$ is not a two-cycle, therefore. Indeed $Z_2(\mathbb{R}P^2)$, and a
fortiori $H_2(\mathbb{R}P^2)$, contain only the zero vector. The only one-cycle is $\alpha - \beta$,
which runs from $P_1$ to $P_1$ via $P_2$, $P_3$ and $P_4$, but (4.34) shows that this is
the boundary of $\frac{1}{2}\sigma$. Thus $H_2(\mathbb{R}P^2,\mathbb{R}) = \{0\}$ and $H_1(\mathbb{R}P^2,\mathbb{R}) = \{0\}$, while
$H_0(\mathbb{R}P^2,\mathbb{R}) = \mathbb{R}$.

We can now see the advantage of restricting ourselves to integer
coefficients. When we are not allowed fractions, the cycle $\gamma = (\alpha - \beta)$ is no longer
a boundary, although $2(\alpha - \beta)$ is the boundary of $\sigma$. Thus, using the symbol
$\mathbb{Z}_2$ to denote the additive group of the integers modulo two, we can write
$H_1(\mathbb{R}P^2,\mathbb{Z}) = \mathbb{Z}_2$. This homology space is a set with only two members
$\{0\gamma, 1\gamma\}$. The finite group $H_1(\mathbb{R}P^2,\mathbb{Z}) = \mathbb{Z}_2$ is said to be the torsion part
of the homology. This is a confusing terminology, because this torsion has
nothing to do with the torsion tensor of Riemannian geometry.
We introduced real-number homology first, because the theory of vector
spaces is simpler than that of modules, and more familiar to physicists. The
torsion is, however, invisible to the real-number homology. We were therefore
buying a simplification at the expense of throwing away information.

The Euler Character 

The sum

$$\chi(M) \stackrel{\rm def}{=} \sum_{p=0}^{d} (-1)^p \dim H_p(M,\mathbb{R}) \tag{4.35}$$

is called the Euler character of the manifold $M$. For example, the 2-sphere
has $\chi(S^2) = 2$, the projective plane has $\chi(\mathbb{R}P^2) = 1$, and the $n$-torus has
$\chi(T^n) = 0$. This number is manifestly a topological invariant because the
individual $\dim H_p(M,\mathbb{R})$ are. We will show that the Euler character is
also equal to $V - E + F - \cdots$ where $V$ is the number of vertices, $E$ is the
number of edges and $F$ is the number of faces in the simplicial dissection. The
dots are for higher dimensional spaces, where the alternating sum continues
with $(-1)^p$ times the number of $p$-simplices. In other words, we are claiming
that

$$\chi(M) = \sum_{p=0}^{d} (-1)^p \dim C_p(M). \tag{4.36}$$

It is not so obvious that this new sum is a topological invariant. The
individual dimensions of the spaces of $p$-chains depend on the details of how we
dissect $M$ into simplices. If our claim is to be correct, the dependence must
somehow drop out when we take the alternating sum.
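Before proving the claim, it is reassuring to try it on the dissections already described. This is only a sketch: the function name `euler_character` is an illustrative choice, and the simplex counts are those quoted earlier in the section for the tetrahedral sphere and the two triangulations of the torus.

```python
def euler_character(simplex_counts):
    """Alternating sum V - E + F - ... of the numbers of p-simplices."""
    return sum((-1) ** p * n for p, n in enumerate(simplex_counts))


print(euler_character([4, 6, 4]))    # tetrahedral 2-sphere: 2
print(euler_character([1, 3, 2]))    # torus of figure 4.2: 0
print(euler_character([9, 27, 18]))  # torus of figure 4.3: 0
```

The two dissections of the torus have very different numbers of simplices, yet give the same alternating sum, as the claim requires.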

A useful tool for working with alternating sums of vector-space
dimensions is provided by the notion of an exact sequence. We say that a set
of vector spaces $V_p$ with maps $f_p : V_p \to V_{p+1}$ is an exact sequence if
${\rm Ker}(f_p) = {\rm Im}(f_{p-1})$. For example, if all cycles were boundaries then the
set of spaces $C_p$ with the maps $\partial_p$ taking us from $C_p$ to $C_{p-1}$ would
constitute an exact sequence (albeit with $p$ decreasing rather than increasing, but
this is irrelevant). When the homology is non-zero, however, we only have
${\rm Im}(f_{p-1}) \subseteq {\rm Ker}(f_p)$, and the number $\dim H_p = \dim({\rm Ker}\, f_p) - \dim({\rm Im}\, f_{p-1})$
provides a measure of how far this set inclusion falls short of being an
equality.
Suppose that

$$\{0\} \xrightarrow{f_0} V_1 \xrightarrow{f_1} V_2 \xrightarrow{f_2} \cdots \xrightarrow{f_{n-1}} V_n \xrightarrow{f_n} \{0\} \tag{4.37}$$

is a finite-length exact sequence. Here, $\{0\}$ is the vector space containing
only the zero vector. Being linear, $f_0$ maps 0 to 0. Also $f_n$ maps everything
in $V_n$ to 0. Since this last map takes everything to zero, and what is mapped
to zero is the image of the penultimate map, we have $V_n = {\rm Im}\, f_{n-1}$. Similarly,
the fact that ${\rm Ker}\, f_1 = {\rm Im}\, f_0 = \{0\}$ shows that ${\rm Im}\, f_1 \subseteq V_2$ is an isomorphic
image of $V_1$. This situation is represented pictorially in figure 4.9.




Now the range-nullspace theorem tells us that

$$\dim V_p = \dim({\rm Im}\, f_p) + \dim({\rm Ker}\, f_p) = \dim({\rm Im}\, f_p) + \dim({\rm Im}\, f_{p-1}). \tag{4.38}$$

When we take the alternating sum of the dimensions, and use $\dim({\rm Im}\, f_0) = 0$
and $\dim({\rm Im}\, f_n) = 0$, we find that the sum telescopes to give

$$\sum_{p=1}^{n} (-1)^p \dim V_p = 0. \tag{4.39}$$

The vanishing of this alternating sum is one of the principal properties of an
exact sequence.
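The telescoping can be written out explicitly. In the following sketch $r_p$ is a shorthand introduced here (it is not notation from the text) for $\dim({\rm Im}\, f_p)$, so that the range-nullspace relation reads $\dim V_p = r_p + r_{p-1}$ with $r_0 = r_n = 0$:

```latex
\begin{aligned}
\sum_{p=1}^{n} (-1)^p \dim V_p
  &= \sum_{p=1}^{n} (-1)^p r_p + \sum_{p=1}^{n} (-1)^p r_{p-1} \\
  &= \sum_{p=1}^{n} (-1)^p r_p - \sum_{p=0}^{n-1} (-1)^p r_p \\
  &= (-1)^n r_n - r_0 = 0 .
\end{aligned}
```

Every intermediate $r_p$ appears twice with opposite signs, and only the two end terms survive.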

Now, for our sequence of spaces $C_p$ with the maps $\partial_p : C_p \to C_{p-1}$, we have
$\dim({\rm Ker}\, \partial_p) = \dim({\rm Im}\, \partial_{p+1}) + \dim H_p$. Using this and the range-nullspace
theorem in the same manner as above shows that

$$\sum_{p=0}^{d} (-1)^p \dim C_p(M) = \sum_{p=0}^{d} (-1)^p \dim H_p(M). \tag{4.40}$$

This confirms our claim.
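The intermediate steps are short enough to display. In this sketch $r_p$ and $h_p$ are shorthands introduced here for $\dim({\rm Im}\,\partial_p)$ and $\dim H_p$, and we use $r_0 = r_{d+1} = 0$, which holds because the chain spaces below $C_0$ and above $C_d$ are zero-dimensional:

```latex
\begin{aligned}
\dim C_p &= \dim(\mathrm{Ker}\,\partial_p) + \dim(\mathrm{Im}\,\partial_p)
          = h_p + r_{p+1} + r_p , \\
\sum_{p=0}^{d} (-1)^p \dim C_p
  &= \sum_{p=0}^{d} (-1)^p h_p
     + \sum_{p=0}^{d} (-1)^p r_{p+1} + \sum_{p=0}^{d} (-1)^p r_p \\
  &= \sum_{p=0}^{d} (-1)^p h_p
     - \sum_{p=1}^{d+1} (-1)^p r_p + \sum_{p=0}^{d} (-1)^p r_p \\
  &= \sum_{p=0}^{d} (-1)^p h_p + r_0 - (-1)^{d+1} r_{d+1}
   = \sum_{p=0}^{d} (-1)^p h_p .
\end{aligned}
```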

Exercise 4.1: Count the number of vertices, edges, and faces in the
triangulation we used to compute the homology groups of the real projective plane
$\mathbb{R}P^2$. Verify that $V - E + F = 1$, and that this is the same number that we
get by evaluating

$$\chi(\mathbb{R}P^2) = \dim H_0(\mathbb{R}P^2,\mathbb{R}) - \dim H_1(\mathbb{R}P^2,\mathbb{R}) + \dim H_2(\mathbb{R}P^2,\mathbb{R}).$$

Exercise 4.2: Show that the sequence

$$\{0\} \to V \xrightarrow{\phi} W \to \{0\}$$

of vector spaces being exact means that the map $\phi : V \to W$ is one-to-one
and onto, and hence an isomorphism $V \cong W$.

Exercise 4.3: Show that a short exact sequence

$$\{0\} \to A \xrightarrow{i} B \xrightarrow{\pi} C \to \{0\}$$

of vector spaces is just a sophisticated way of asserting that $C \cong B/A$. More
precisely, show that the map $i$ is injective (one-to-one), so $A$ can be considered
to be a subspace of $B$. Then show that the map $\pi$ is surjective (onto), and
can be regarded as projecting $B$ onto the equivalence classes $B/A$.

Exercise 4.4: Let $\alpha : A \to B$ be a linear map. Show that

$$\{0\} \to {\rm Ker}\,\alpha \xrightarrow{i} A \xrightarrow{\alpha} B \xrightarrow{\pi} {\rm Coker}\,\alpha \to \{0\}$$

is an exact sequence. (Recall that ${\rm Coker}\,\alpha = B/{\rm Im}\,\alpha$.)

4.3.2 Relative homology 

Mathematicians have invented powerful tools for computing homology. In 
this section we introduce one of them: the exact sequence of a pair. We 
describe this tool in detail because a homotopy analogue of this exact
sequence is used in physics to classify defects such as dislocations, vortices and
monopoles. Homotopy theory is, however, harder and requires more technical
apparatus than homology, so the ideas are easier to explain here.

We have seen that it is useful to think of complicated manifolds as being 
assembled out of simpler ones. We constructed the torus, for example, by 
gluing together edges of a rectangle. Another construction technique involves 
shrinking parts of a manifold to a point. Think, for example, of the unit
2-disc as being a circle of cloth with a drawstring sewn into its boundary. Now
pull the string tight to form a spherical bag. The continuous functions on
the resulting 2-sphere are those continuous functions on the disc that took
the same value at all points on its boundary. Recall that we used this idea in
section 3.4.2, where we claimed that those spin textures in $\mathbb{R}^2$ that point in a
fixed direction at infinity can be thought of as spin textures on the 2-sphere.
We now extend this shrinking trick to homology.

Suppose that we have a chain complex consisting of spaces $C_p$ and
boundary operations $\partial_p$. We will denote this chain complex by $(C,\partial)$. Another
set of spaces and boundary operations $(C',\partial')$ is a subcomplex of $(C,\partial)$ if
each $C'_p \subseteq C_p$ and $\partial'_p(c) = \partial_p(c)$ for each $c \in C'_p$. This situation arises if we
have a simplicial complex $S$ and some subset $S'$ that is itself a simplicial
complex, and take $C'_p = C_p(S')$.

Since each $C'_p$ is a subspace of $C_p$ we can form the quotient spaces $C_p/C'_p$
and make them into a chain complex by defining, for $c + C'_p \in C_p/C'_p$,

$$\partial_p(c + C'_p) = \partial_p c + C'_{p-1}. \tag{4.41}$$

It is easy to see that this operation is well defined (i.e. it gives the same output
independent of the choice of representative in the equivalence class $c + C'_p$),
that $\partial_p : C_p/C'_p \to C_{p-1}/C'_{p-1}$ is a linear map, and that $\partial_{p-1}\partial_p = 0$. We have
constructed a new chain complex $(C/C',\partial)$. We can therefore form its
homology spaces in the usual way. The resulting vector space, or abelian group,
$H_p(C/C')$ is the $p$-th relative homology group of $C$ modulo $C'$. When $C$ and
$C'$ arise from simplicial complexes $S' \subseteq S$, these spaces are what remains of
the homology of $S$ after every chain in $S'$ has been shrunk to a point. In
this case, it is customary to write $H_p(S,S')$ instead of $H_p(C/C')$, and
similarly write the chain, cycle and boundary spaces as $C_p(S,S')$, $Z_p(S,S')$ and
$B_p(S,S')$ respectively.

Example: Constructing the two-sphere $S^2$ from the two-ball (or disc) $B^2$.
We regard $B^2$ as the triangular simplex $P_1P_2P_3$, and its boundary, the
one-sphere or circle $S^1$, as the simplicial complex containing the points $P_1$,
$P_2$, $P_3$ and the sides $P_1P_2$, $P_2P_3$, $P_3P_1$, but not the interior of the triangle.
We wish to contract this boundary complex to a point, and form the relative
chain complexes and their homology spaces. Of the spaces we quotient by,
$C_0(S^1)$ is spanned by the points $P_1$, $P_2$, $P_3$, the 1-chain space $C_1(S^1)$ is
spanned by the sides $P_1P_2$, $P_2P_3$, $P_3P_1$, while $C_2(S^1) = \{0\}$. The space of
relative chains $C_2(B^2,S^1)$ consists of multiples of $P_1P_2P_3 + C_2(S^1)$, and the
boundary

$$\partial_2\big(P_1P_2P_3 + C_2(S^1)\big) = (P_2P_3 + P_3P_1 + P_1P_2) + C_1(S^1) \tag{4.42}$$

is equivalent to zero because $P_2P_3 + P_3P_1 + P_1P_2 \in C_1(S^1)$. Thus
$P_1P_2P_3 + C_2(S^1)$ is a non-bounding cycle and spans $H_2(B^2,S^1)$, which is therefore
one dimensional. This space is isomorphic to the one-dimensional $H_2(S^2)$.
Similarly $H_1(B^2,S^1)$ is zero dimensional, and so isomorphic to $H_1(S^2)$. This
is because all chains in $C_1(B^2,S^1)$ are in $C_1(S^1)$ and therefore equivalent to
zero.

A peculiarity, however, is that $H_0(B^2,S^1)$ is not isomorphic to $H_0(S^2) = \mathbb{R}$.
Instead, we find that $H_0(B^2,S^1) = \{0\}$ because all the points are
equivalent to zero. This vanishing is characteristic of the zeroth relative homology
space $H_0(S,S')$ for the simplicial triangulation of any connected manifold.
It occurs because $S$ being connected means that any point $P$ in $S$ can be
reached by walking along edges from any other point, in particular from a
point $P'$ in $S'$. This makes $P$ homologous to $P'$, and so equivalent to zero
in $H_0(S,S')$.
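The relative homology of this example can also be computed by brute force. The following sketch is not from the original notes and rests on stated assumptions: simplices are stored as sorted vertex tuples, the cosets of the simplices of $S$ that are not in $S'$ are taken as a basis of $C_p(S)/C_p(S')$, and any face lying in $S'$ is set to zero. The names `rank` and `relative_betti` are illustrative.

```python
from fractions import Fraction


def rank(matrix):
    """Rank of a matrix (list of rows) by Gaussian elimination over the rationals."""
    rows = [[Fraction(x) for x in row] for row in matrix]
    n_cols = len(rows[0]) if rows else 0
    found = 0
    for col in range(n_cols):
        pivot = next((r for r in range(found, len(rows)) if rows[r][col] != 0), None)
        if pivot is None:
            continue
        rows[found], rows[pivot] = rows[pivot], rows[found]
        for r in range(found + 1, len(rows)):
            factor = rows[r][col] / rows[found][col]
            rows[r] = [a - factor * b for a, b in zip(rows[r], rows[found])]
        found += 1
    return found


def relative_betti(simplices, sub_simplices):
    """Dimensions of the relative homology spaces H_p(S, S').

    simplices[p] lists the p-simplices of S and sub_simplices[p] those of S'.
    """
    top = len(simplices) - 1
    basis = [
        [s for s in simplices[p] if s not in set(sub_simplices[p])]
        for p in range(top + 1)
    ]
    ranks = [0] * (top + 2)  # ranks[p] is the rank of the relative boundary map d_p
    for p in range(1, top + 1):
        index = {face: row for row, face in enumerate(basis[p - 1])}
        matrix = [[0] * len(basis[p]) for _ in basis[p - 1]]
        for col, simplex in enumerate(basis[p]):
            for j in range(len(simplex)):
                face = simplex[:j] + simplex[j + 1:]
                if face in index:  # faces lying in S' are equivalent to zero
                    matrix[index[face]][col] = (-1) ** j
        ranks[p] = rank(matrix)
    return [len(basis[p]) - ranks[p] - ranks[p + 1] for p in range(top + 1)]


# The disc B^2 as the triangle P1 P2 P3, and its boundary circle S^1.
disc = [[(1,), (2,), (3,)], [(1, 2), (1, 3), (2, 3)], [(1, 2, 3)]]
circle = [[(1,), (2,), (3,)], [(1, 2), (1, 3), (2, 3)], []]
print(relative_betti(disc, circle))  # [0, 0, 1]

# The interval B^1 = P1 P2 relative to its two endpoints S^0.
interval = [[(1,), (2,)], [(1, 2)]]
endpoints = [[(1,), (2,)], []]
print(relative_betti(interval, endpoints))  # [0, 1]
```

The first result reproduces the dimensions found above for $H_p(B^2,S^1)$; the second is the one-dimensional analogue, in which the interval with its endpoints identified becomes a circle.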



Exact homology sequence of a pair 

Homological algebra is full of miracles. Here we describe one of them. From
the ingredients we have at hand, we can construct a semi-infinite sequence
of spaces and linear maps between them

$$\begin{aligned}
\cdots \xrightarrow{\partial_{*p+1}} H_p(S') \xrightarrow{i_{*p}} H_p(S) \xrightarrow{\pi_{*p}} H_p(S,S') \xrightarrow{\partial_{*p}} \cdots \\
\cdots \xrightarrow{\partial_{*1}} H_0(S') \xrightarrow{i_{*0}} H_0(S) \xrightarrow{\pi_{*0}} H_0(S,S') \to \{0\}.
\end{aligned} \tag{4.43}$$

The maps $i_{*p}$ and $\pi_{*p}$ are induced by the natural injection $i_p : C_p(S') \to C_p(S)$
and projection $\pi_p : C_p(S) \to C_p(S)/C_p(S')$. It is only necessary to check that

$$i_{p-1}\partial_p = \partial_p i_p, \qquad \pi_{p-1}\partial_p = \partial_p \pi_p, \tag{4.44}$$

to see that they are compatible with the passage from the chain spaces to
the homology spaces. More discussion is required of the connection map $\partial_{*p}$
that takes us from one row to the next in the displayed form of (4.43).

Let $h \in H_p(S,S')$, then $h = z + B_p(S,S')$ for some cycle $z \in Z_p(S,S')$, and
in turn $z = c + C_p(S')$ for some $c \in C_p(S)$. (So two choices of representative
of equivalence class are being made here.) Now $\partial_p z = 0$, which means that
$\partial_p c \in C_{p-1}(S')$. This fact, when combined with $\partial_{p-1}\partial_p = 0$, tells us that
$\partial_p c \in Z_{p-1}(S')$. We now set

$$\partial_{*p}(h) = \partial_p c + B_{p-1}(S'). \tag{4.45}$$

This sounds rather involved, but let's say it again in words: an element of
$H_p(S,S')$ is a relative $p$-cycle modulo $S'$. This means that its boundary is
not necessarily zero, but may be a non-zero element of $C_{p-1}(S')$. Since this
element is the boundary of something its own boundary vanishes, so it is a
$(p-1)$-cycle in $C_{p-1}(S')$ and hence a representative of a homology class in
$H_{p-1}(S')$. This homology class is the output of the $\partial_{*p}$ map.

The miracle is that the sequence of maps (4.43) is exact. It is an example
of a standard homological algebra construction of a long exact sequence out
of a family of short exact sequences, in this case out of the sequences

$$\{0\} \to C_p(S') \to C_p(S) \to C_p(S,S') \to \{0\}. \tag{4.46}$$

Proving that the long sequence is exact is straightforward. All one must do
is check each map to see that it has the properties required. This exercise in
diagram chasing is left to the reader.

This long exact sequence is called the exact homology sequence of a pair.
If we know that certain homology spaces are zero dimensional, it provides a
powerful tool for computing other spaces in the sequence. As an illustration,
consider the sequence of the pair $B^{n+1}$ and $S^n$ for $n > 0$:

$$\begin{aligned}
\cdots \to \underbrace{H_p(B^{n+1})}_{=\{0\}} &\to H_p(B^{n+1},S^n) \to H_{p-1}(S^n) \to \\
\underbrace{H_{p-1}(B^{n+1})}_{=\{0\}} &\to H_{p-1}(B^{n+1},S^n) \to H_{p-2}(S^n) \to \\
&\;\;\vdots \\
\underbrace{H_1(B^{n+1})}_{=\{0\}} &\to H_1(B^{n+1},S^n) \to \underbrace{H_0(S^n)}_{=\mathbb{R}} \to \\
\underbrace{H_0(B^{n+1})}_{=\mathbb{R}} &\to H_0(B^{n+1},S^n) \to \{0\}.
\end{aligned} \tag{4.47}$$

We have inserted here the easily established data that $H_p(B^{n+1}) = \{0\}$ for
$p > 0$ (which is a consequence of the $(n+1)$-ball being a contractible space),
and that $H_0(B^{n+1})$ and $H_0(S^n)$ are one dimensional because they consist of
a single connected component. We read off, from the $\{0\} \to A \to B \to \{0\}$
exact subsequences, the isomorphisms

$$H_p(B^{n+1},S^n) \cong H_{p-1}(S^n), \quad p > 1, \tag{4.48}$$

and from the exact sequence

$$\{0\} \to H_1(B^{n+1},S^n) \to \mathbb{R} \to \mathbb{R} \to H_0(B^{n+1},S^n) \to \{0\} \tag{4.49}$$

that $H_1(B^{n+1},S^n) = \{0\} = H_0(B^{n+1},S^n)$. The first of these equalities holds
because $H_1(B^{n+1},S^n)$ is the kernel of the isomorphism $\mathbb{R} \to \mathbb{R}$ (the inclusion
of $S^n$ in $B^{n+1}$ sends the class of a point to the class of a point), and the
second because $H_0(B^{n+1},S^n)$ is the range of a surjective null map.

In the case $n = 0$, we have to modify our last conclusion because
$H_0(S^0) = \mathbb{R} \oplus \mathbb{R}$ is two dimensional. (Remember that $H_0(M)$ counts the number of
disconnected components of $M$, and the zero-sphere $S^0$ consists of the two
disconnected points $P_1$, $P_2$ lying in the boundary of the interval $B^1 = P_1P_2$.)
As a consequence, the last five maps become

$$\{0\} \to H_1(B^1,S^0) \to \mathbb{R} \oplus \mathbb{R} \to \mathbb{R} \to H_0(B^1,S^0) \to \{0\}. \tag{4.50}$$

This tells us that $H_1(B^1,S^0) = \mathbb{R}$ and $H_0(B^1,S^0) = \{0\}$.
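As a consistency check, we can apply the vanishing of the alternating sum of dimensions (4.39) to the two short sequences (4.49) and (4.50). This is only a sketch: it constrains the dimensions without fixing each one separately, so the individual values still rely on the arguments given in the text.

```latex
\begin{aligned}
(4.49):&\quad -\dim H_1(B^{n+1},S^n) + 1 - 1 + \dim H_0(B^{n+1},S^n) = 0
   \;\Rightarrow\; \dim H_1(B^{n+1},S^n) = \dim H_0(B^{n+1},S^n), \\
(4.50):&\quad -\dim H_1(B^1,S^0) + 2 - 1 + \dim H_0(B^1,S^0) = 0
   \;\Rightarrow\; \dim H_1(B^1,S^0) = 1 + \dim H_0(B^1,S^0).
\end{aligned}
```

Both relations agree with the values quoted above: $0 = 0$ for $n > 0$, and $1 = 1 + 0$ for $n = 0$.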

Exact homotopy sequence of a pair 

We have met the homotopy groups $\pi_n(M)$ in section 3.4.4. As we saw there,
homotopy groups can be used to classify defects or solitons in physical
systems in which some field takes values in the manifold $M$. When the system
has undergone spontaneous symmetry breaking from a larger symmetry $G$
to a subgroup $H$, the relevant manifold is the coset $G/H$. The group $\pi_n(G)$
can be taken to be the set of (homotopy classes of) continuous maps of an
$n$-dimensional cube into $G$, with the surface of the cube mapping to the identity
element $e \in G$. We similarly define the relative homotopy group $\pi_n(G,H)$ of $G$
modulo $H$ to be the set of continuous maps of the cube into $G$, with
all-but-one face of the cube mapping to $e$, but with the remaining face mapping
to the subgroup $H$. It can then be shown that $\pi_n(G/H) \cong \pi_n(G,H)$ (the hard
part is to show that any continuous map into $G/H$ can be represented as the
projection of some continuous map into $G$).
The short exact sequence

$$\{e\} \to H \xrightarrow{i} G \xrightarrow{\pi} G/H \to \{e\} \tag{4.51}$$

of group homomorphisms (where $\{e\}$ is the group consisting only of the
identity element) then gives rise to the long exact sequence

$$\cdots \to \pi_n(H) \to \pi_n(G) \to \pi_n(G,H) \to \pi_{n-1}(H) \to \cdots \tag{4.52}$$

The derivation and utility of this exact sequence is very well described in the
review article by Mermin cited in section 3.4.4. We have therefore contented
ourselves with simply displaying the result so that the reader can see the
similarity between the homology theorem and its homotopy-theory analogue.

4.4 De Rham's Theorem 

We still have not related homology to cohomology. The link is provided by 
integration. 

The integral provides a natural pairing of a $p$-chain $c$ and a $p$-form $\omega$: if
$c = a_1 s_1 + a_2 s_2 + \cdots + a_n s_n$, where the $s_i$ are simplices, we set

$$(c,\omega) = \sum_i a_i \int_{s_i} \omega. \tag{4.53}$$

The perhaps mysterious notion of "adding" geometric simplices is thus given
a concrete interpretation in terms of adding real numbers.
Stokes' theorem now reads

$$(\partial c, \omega) = (c, d\omega), \tag{4.54}$$

suggesting that $d$ and $\partial$ should be regarded as adjoints of each other. From
this observation follows the key fact that the pairing between chains and
forms descends to a pairing between homology classes and cohomology classes.
In other words,

$$(z + \partial c, \omega + d\chi) = (z, \omega), \tag{4.55}$$

so it does not matter which representative of the equivalence classes we take
when we compute the integral. Let us see why this is so:
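Suppose $z \in Z_p$ and $\omega_2 = \omega_1 + d\eta$. Then

$$(z,\omega_2) = \int_z \omega_2 = \int_z \omega_1 + \int_z d\eta = \int_z \omega_1 + \int_{\partial z} \eta = (z,\omega_1) \tag{4.56}$$

because $\partial z = 0$. Thus, all elements of the cohomology class of $\omega$ return the
same answer when integrated over a cycle.
Similarly, if $\omega \in Z^p$ and $c_2 = c_1 + \partial a$ then

$$(c_2,\omega) = \int_{c_1} \omega + \int_{\partial a} \omega = \int_{c_1} \omega + \int_a d\omega = \int_{c_1} \omega = (c_1,\omega),$$

since $d\omega = 0$.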

All this means that we can consider the equivalence classes of closed forms
composing $H^p_{\rm dR}(M)$ to be elements of $(H_p(M))^*$, the dual space of $H_p(M)$,
hence the "co" in cohomology. The existence of the pairing does not
automatically mean that $H^p_{\rm dR}$ is the dual space to $H_p(M)$, however, because
there might be elements of the dual space that are not in $H^p_{\rm dR}$, and there
might be distinct elements of $H^p_{\rm dR}$ that give identical answers when integrated
over any cycle, and so correspond to the same element in $(H_p(M))^*$. This
does not happen, however, when the manifold is compact: De Rham showed
that, for compact manifolds, $(H_p(M,\mathbb{R}))^* = H^p_{\rm dR}(M,\mathbb{R})$. We will not try to
prove this, but be satisfied with some examples.

The statement $(H_p(M))^* = H^p_{\rm dR}(M)$ neatly summarizes de Rham's
results, but, in practice, the more explicit statements below are more useful.

Theorem: (de Rham) Suppose that $M$ is a compact manifold.

1) A closed $p$-form $\omega$ is exact if and only if

$$\int_{z_i} \omega = 0 \tag{4.57}$$

for all cycles $z_i \in Z_p$. It suffices to check this for one representative of
each homology class.

2) If $z_i \in Z_p$, $i = 1, \ldots, \dim H_p$, is a basis for the $p$-th homology space,
and $\alpha_i$ a set of numbers, one for each $z_i$, then there exists a closed
$p$-form $\omega$ such that

$$\int_{z_i} \omega = \alpha_i. \tag{4.58}$$

If $\omega^j$ constitute a basis of the vector space $H^p(M)$ then the matrix of numbers

$$\Omega_i{}^j = (z_i, \omega^j) = \int_{z_i} \omega^j \tag{4.59}$$

is called the period matrix, and the $\Omega_i{}^j$ themselves are the periods.
Example: $H_1(T^2) = \mathbb{R} \oplus \mathbb{R}$ is two-dimensional. Since a finite-dimensional
vector space and its dual have the same dimension, de Rham tells us that
$H^1_{\rm dR}(T^2)$ is also two-dimensional. If we take as coordinates on $T^2$ the angles
$\theta$ and $\phi$, then the basis elements, or generators, of the cohomology spaces are
the forms "$d\theta$" and "$d\phi$". We have inserted the quotes to stress that these
expressions are not the d of a function. The angles $\theta$ and $\phi$ are not functions
on the torus, since they are not single-valued. The homology basis 1-cycles
can be taken as $z_\theta$ running from $\theta = 0$ to $\theta = 2\pi$ along $\phi = \pi$, and $z_\phi$ running
from $\phi = 0$ to $\phi = 2\pi$ along $\theta = \pi$. Clearly, $\omega = \alpha_\theta\, d\theta/2\pi + \alpha_\phi\, d\phi/2\pi$ returns
$\int_{z_\theta}\omega = \alpha_\theta$ and $\int_{z_\phi}\omega = \alpha_\phi$ for any $\alpha_\theta$, $\alpha_\phi$, so $\{d\theta/2\pi, d\phi/2\pi\}$ and $\{z_\theta, z_\phi\}$ are
dual bases.
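
The dual-basis statement can be checked numerically. The following is a minimal Python sketch, not part of the original notes: the helper names `omega` and `period`, and the exact piece $d(\sin\theta\cos\phi)$ that we add to $\omega$, are our own illustrative assumptions. Adding an exact form should not change the periods, since they depend only on the cohomology class.

```python
import math

def omega(theta, phi, a_theta, a_phi):
    """Components (w_theta, w_phi) of the closed one-form
    a_theta dtheta/2pi + a_phi dphi/2pi + d(sin(theta) cos(phi)).
    The exact piece is an illustrative assumption."""
    w_theta = a_theta / (2 * math.pi) + math.cos(theta) * math.cos(phi)
    w_phi = a_phi / (2 * math.pi) - math.sin(theta) * math.sin(phi)
    return w_theta, w_phi

def period(cycle, a_theta, a_phi, n=2000):
    """Midpoint-rule integral of omega over z_theta (phi = pi)
    or over z_phi (theta = pi)."""
    h = 2 * math.pi / n
    total = 0.0
    for k in range(n):
        t = (k + 0.5) * h
        if cycle == "theta":
            total += omega(t, math.pi, a_theta, a_phi)[0] * h
        else:
            total += omega(math.pi, t, a_theta, a_phi)[1] * h
    return total
```

Calling `period("theta", 1.5, -0.7)` and `period("phi", 1.5, -0.7)` should reproduce the chosen numbers $\alpha_\theta$ and $\alpha_\phi$ up to rounding error.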

Example: We have earlier computed $H_2(\mathbb{R}P^2,\mathbb{R}) = \{0\}$ and $H_1(\mathbb{R}P^2,\mathbb{R}) = \{0\}$.
De Rham therefore tells us that $H^2(\mathbb{R}P^2,\mathbb{R}) = \{0\}$ and $H^1(\mathbb{R}P^2,\mathbb{R}) = \{0\}$.
From this we deduce that all closed one- and two-forms on the projective
plane $\mathbb{R}P^2$ are exact.

Example: As an illustration of de Rham part 1), observe that it is easy to
show that a closed one-form $\phi$ can be written as $df$, provided that $\int_{z_i}\phi = 0$
for all cycles. We simply define $f = \int_{x_0}^{x}\phi$, and observe that the proviso
ensures that $f$ is not multivalued.

Example: A more subtle problem is to show that, given a two-form $\omega$ on $S^2$,
with $\int_{S^2}\omega = 0$, then there is a globally defined $\chi$ such that $\omega = d\chi$. We
begin by covering $S^2$ by two open sets $D_+$ and $D_-$ which have the form of
caps such that $D_+$ includes all of $S^2$ except for a neighbourhood of the south
pole, while $D_-$ includes everything except a neighbourhood of the north pole,
and the intersection, $D_+ \cap D_-$, has the topology of an annulus, or cingulum,
encircling the equator.




Figure 4.10: A covering of the sphere by two contractible caps.

Since both $D_+$ and $D_-$ are contractible, there are one-forms $\chi_+$ and $\chi_-$ such
that $\omega = d\chi_+$ in $D_+$ and $\omega = d\chi_-$ in $D_-$. Thus,

$$d(\chi_+ - \chi_-) = 0, \quad \hbox{in } D_+ \cap D_-. \tag{4.60}$$

Dividing the sphere into two disjoint sets with a common (but oppositely
oriented) boundary $\Gamma \subset D_+ \cap D_-$ we have

$$0 = \int_{S^2}\omega = \oint_\Gamma (\chi_+ - \chi_-), \tag{4.61}$$

and this is true for any such curve $\Gamma$. Thus, by the previous example,

$$\phi \equiv (\chi_+ - \chi_-) = df \tag{4.62}$$

for some smooth function $f$ defined in $D_+ \cap D_-$. We now introduce a partition
of unity subordinate to the cover of $S^2$ by $D_+$ and $D_-$. This partition is a
pair of non-negative smooth functions, $\rho_\pm$, such that $\rho_+$ is non-zero only in
$D_+$, $\rho_-$ is non-zero only in $D_-$, and $\rho_+ + \rho_- = 1$. Now

$$f = \rho_+ f - (-\rho_-) f, \tag{4.63}$$

and $f_- = \rho_+ f$ is a function defined everywhere on $D_-$. Similarly $f_+ =
(-\rho_-) f$ is a function on $D_+$. Notice the interchange of $\pm$ labels! This is not
a mistake. The function $f$ is not defined outside $D_+ \cap D_-$, but we can define
$\rho_- f$ everywhere on $D_+$ because $f$ gets multiplied by zero wherever we have
no specific value to assign to it.

We now observe that

$$\chi_+ + df_+ = \chi_- + df_-, \quad \hbox{in } D_+ \cap D_-. \tag{4.64}$$

Thus $\omega = d\chi$, where $\chi$ is defined everywhere by the rule

$$\chi = \begin{cases} \chi_+ + df_+, & \hbox{in } D_+,\\ \chi_- + df_-, & \hbox{in } D_-. \end{cases} \tag{4.65}$$

It does not matter which definition we take in the cingular region $D_+ \cap D_-$,
because the two definitions coincide there.

The methods of this example, a special case of the Mayer-Vietoris
principle, can be extended to give a proof of de Rham's claims.

4.5 Poincaré Duality

De Rham's theorem does not require that our manifold M be orientable. Our
next results do, however, require orientability. We therefore assume throughout
this section that M is a compact, orientable, D-dimensional manifold.

We begin with the observation that if the forms $\omega_1$ and $\omega_2$ are closed then
so is $\omega_1 \wedge \omega_2$. Furthermore if one or both of $\omega_1$, $\omega_2$ is exact then the product
$\omega_1 \wedge \omega_2$ is also exact. It follows that the cohomology class $[\omega_1 \wedge \omega_2]$ of $\omega_1 \wedge \omega_2$
depends only on the cohomology classes $[\omega_1]$ and $[\omega_2]$. The wedge product
thus induces a map

$$H^p(M,\mathbb{R}) \times H^q(M,\mathbb{R}) \stackrel{\wedge}{\to} H^{p+q}(M,\mathbb{R}), \tag{4.66}$$

which is called the "cup product" of the cohomology classes. It is written
as

$$[\omega_1 \wedge \omega_2] = [\omega_1] \cup [\omega_2], \tag{4.67}$$

and gives the cohomology the structure of a graded-commutative ring,
denoted by $H^\bullet(M,\mathbb{R})$.

More significant for us than the ring structure is that, given $\omega \in H^D(M,\mathbb{R})$,
we can obtain a real number by forming $\int_M \omega$. (This is the point at which
we need orientability. We only know how to integrate over orientable chains,
and so cannot even define $\int_M \omega$ when M is not orientable.) We can combine
this integral with the cup product to make any cohomology class
$[f] \in H^{D-p}(M,\mathbb{R})$ into an element F of $(H^p(M,\mathbb{R}))^*$. We do this by setting

$$F([g]) = \int_M f \wedge g \tag{4.68}$$

for each $[g] \in H^p(M,\mathbb{R})$. Furthermore, it is possible to show that we can
get any element F of $(H^p(M,\mathbb{R}))^*$ in this way, and the corresponding $[f]$ is
unique. But de Rham has already given us a way of identifying the elements
of $(H^p(M,\mathbb{R}))^*$ with the cycles in $H_p(M,\mathbb{R})$! There is, therefore, a 1-1 onto
map

$$H_p(M,\mathbb{R}) \leftrightarrow H^{D-p}(M,\mathbb{R}). \tag{4.69}$$

In particular the dimensions of these two spaces must coincide:

$$b_p(M) = b_{D-p}(M). \tag{4.70}$$

This equality of Betti numbers is called Poincaré duality. Poincaré originally
conceived of it geometrically. His idea was to construct from each simplicial
triangulation S of M a new "dual" triangulation S', where, in two dimensions
for example, we place a new vertex at the centre of each triangle, and join the
vertices by lines through each side of the old triangles to make new cells,
each new cell containing one of the old vertices. If we are lucky, this process
will have the effect of replacing each p-simplex by a (D − p)-simplex, and so
set up a map between $C_p(S)$ and $C_{D-p}(S')$ that turns the homology "upside
down." The new cells are not always simplices, however, and it is hard to
make this construction systematic. Poincaré's original recipe was flawed.

Our present approach to Poincaré's result is asserting that for each basis
p-cycle class $[z_i^p]$ there is a unique (up to cohomology) (D − p)-form $\omega_i^{D-p}$
such that

$$\int_{z_i^p} f = \int_M \omega_i^{D-p} \wedge f. \tag{4.71}$$

We can construct this $\omega_i^{D-p}$ "physically" by taking a representative cycle $z_i^p$
in the homology class $[z_i^p]$ and thinking of it as a surface with a conserved
unit (D − p)-form current flowing in its vicinity. An example would be the
two-form topological current running along the one-dimensional worldline of
a Skyrmion. (See the discussion surrounding equation (3.63).) The $\omega_i^{D-p}$
form a basis for $H^{D-p}(M,\mathbb{R})$. We can therefore expand $f \sim f^i \omega_i^{D-p}$, and
similarly for the closed p-form g, to obtain

$$\int_M f \wedge g = f^i g^j I(i,j) \tag{4.72}$$

where the matrix

$$I(i,j) \equiv I(z_i^p, z_j^{D-p}) = \int_M \omega_i^{D-p} \wedge \omega_j^{p} \tag{4.73}$$

is called the intersection form. From the definition we have

$$I(i,j) = (-1)^{p(D-p)} I(j,i). \tag{4.74}$$

Less obvious is that $I(i,j)$ is an integer that reports the number of times
(counted with orientation) that the cycles $z_i^p$ and $z_j^{D-p}$ intersect. This latter
fact can be understood from our construction of the $\omega_i^p$ as unit currents
localized near the $z_i^{D-p}$ cycles. The integrand in (4.73) is non-zero only in the
neighbourhood of the intersections of $z_i^p$ with $z_j^{D-p}$, and at each intersection
constitutes a D-form that integrates up to give ±1.

Figure 4.11: The intersection of two cycles: $I(\alpha,\beta) = 1 = 1 - 1 + 1$.



This claim is illustrated in the left-hand part of figure 4.11, which shows a
region surrounding the intersection of the $\alpha$ and $\beta$ one-cycles on the 2-torus.
The co-ordinate system has been chosen so that the $\alpha$ cycle runs along the
x axis and the $\beta$ cycle along the y axis. Each cycle is surrounded by the
narrow shaded regions $-w < y < w$ and $-w < x < w$, respectively. To
construct suitable forms $\omega_\alpha$ and $\omega_\beta$ we select a smooth function $f(x)$ that
vanishes for $|x| > w$ and such that $\int f\,dx = 1$. In the local chart we can then
set

$$\omega_\alpha = f(y)\,dy, \qquad \omega_\beta = -f(x)\,dx,$$

both these forms being closed. The intersection number is given by the
integral

$$I(\alpha,\beta) = \int \omega_\alpha \wedge \omega_\beta = \int\!\!\int f(x) f(y)\, dx\,dy = 1. \tag{4.75}$$

The right-hand part of figure 4.11 illustrates why this intersection number
depends only on the homology classes of the two one-cycles, and not on their
particular instantiation as curves.
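
The local computation of the intersection number can be imitated numerically. The sketch below is our own illustration and rests on assumptions that are not in the text: a particular bump function $e^{-1/(1-(x/w)^2)}$, a half-width $w = 0.5$, and a midpoint-rule quadrature. It integrates $\omega_\alpha\wedge\omega_\beta$ and $\omega_\beta\wedge\omega_\alpha$ over the chart, using $(p\,dx + q\,dy)\wedge(r\,dx + s\,dy) = (ps - qr)\,dx\,dy$.

```python
import math

W = 0.5   # half-width w of the shaded strips (arbitrary choice)
N = 200   # midpoint-rule points per axis
H = 2 * W / N
GRID = [-W + (k + 0.5) * H for k in range(N)]

def bump(x):
    """Unnormalised smooth bump that vanishes for |x| >= W."""
    s = x / W
    return math.exp(-1.0 / (1.0 - s * s)) if abs(s) < 1.0 else 0.0

# Normalise so that the discrete integral of f is one.
NORM = sum(bump(x) for x in GRID) * H

def f(x):
    return bump(x) / NORM

def omega_alpha(x, y):
    """f(y) dy, returned as (dx-component, dy-component)."""
    return (0.0, f(y))

def omega_beta(x, y):
    """-f(x) dx, returned as (dx-component, dy-component)."""
    return (-f(x), 0.0)

def wedge_integral(u, v):
    """Integral of u ^ v over the square |x|, |y| < W."""
    total = 0.0
    for x in GRID:
        for y in GRID:
            p, q = u(x, y)
            r, s = v(x, y)
            total += (p * s - q * r) * H * H
    return total

I_ab = wedge_integral(omega_alpha, omega_beta)
I_ba = wedge_integral(omega_beta, omega_alpha)
```

Here `I_ab` should come out as +1 and `I_ba` as −1, in line with the antisymmetry (4.74) for p = 1, D = 2.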

We can more conveniently re-express (4.72) in terms of the periods of the
forms

$$f_i \equiv \int_{z_i^p} f = I(i,k) f^k, \qquad g_j \equiv \int_{z_j^{D-p}} g = I(j,l) g^l, \tag{4.76}$$

as

$$\int_M f \wedge g = \sum_{i,j} K(i,j) \int_{z_i^p} f \int_{z_j^{D-p}} g, \tag{4.77}$$

where

$$K(i,j) = I^{-1}(i,k)\, I^{-1}(j,l)\, I(k,l) = I^{-1}(j,i) \tag{4.78}$$

is the transpose of the inverse of the intersection-form matrix. The
decomposition (4.77) of the integral of the product of a pair of closed forms into
a bilinear form in their periods is one of the two principal results of this
section, the other being (4.70).
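
As a concrete instance of (4.78), the following Python sketch computes K for the 2-torus. The basis ordering ($\alpha$, $\beta$), the helper names, and the use of exact rational arithmetic are our own assumptions; the only input from the text is $I(\alpha,\beta) = 1$ together with the antisymmetry (4.74).

```python
from fractions import Fraction

def inverse_2x2(m):
    """Inverse of a 2-by-2 matrix with rational entries."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def transpose(m):
    return [list(row) for row in zip(*m)]

# Intersection form on T^2 in the basis (alpha, beta):
# I(alpha, beta) = +1 and, by (4.74), I(beta, alpha) = -1.
I = [[Fraction(0), Fraction(1)], [Fraction(-1), Fraction(0)]]

# K is the transpose of the inverse of I, as in (4.78).
K = transpose(inverse_2x2(I))

def integral_of_wedge(periods_a, periods_b):
    """Right-hand side of (4.77) for closed one-forms a and b whose
    (alpha, beta) periods are given as pairs."""
    return sum(K[i][j] * periods_a[i] * periods_b[j]
               for i in range(2) for j in range(2))
```

With these conventions `integral_of_wedge` reduces to $\int_\alpha a \int_\beta b - \int_\beta a \int_\alpha b$, the combination that is derived directly for the torus in the remainder of this section.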

In simple cases we can obtain the decomposition (4.77) by more direct
methods. Suppose, for example, that we label the cycles generating the
homology group $H_1(T^2)$ of the 2-torus as $\alpha$ and $\beta$, and that a and b are
closed (da = db = 0), but not necessarily exact, one-forms. We will show
that

$$\int_{T^2} a \wedge b = \int_\alpha a \int_\beta b - \int_\alpha b \int_\beta a. \tag{4.79}$$

To do this, we cut the torus along the cycles $\alpha$ and $\beta$ and open it out into
a rectangle with sides of length $L_x$ and $L_y$. The cycles $\alpha$ and $\beta$ will form
the sides of the rectangle and we will take them as lying parallel to the x
and y axes, respectively. Functions on the torus now become functions on
the rectangle. Not all functions on the rectangle descend from functions on
the torus, however. Only those functions that satisfy the periodic boundary
conditions $f(0,y) = f(L_x,y)$ and $f(x,0) = f(x,L_y)$ can be considered
(mathematicians would say "can be lifted") to be functions on the torus.

Figure 4.12: Cut-open torus



Since the rectangle (but not the torus) is contractible, we can write $a = df$
where $f$ is a function on the rectangle, but not necessarily a function on
the torus, i.e., $f$ will not, in general, be periodic. Since $a \wedge b = d(fb)$, we
can now use Stokes' theorem to evaluate

$$\int_{T^2} a \wedge b = \int_{T^2} d(fb) = \int_{\partial T^2} fb. \tag{4.80}$$

The two integrals on the two vertical sides of the rectangle can be combined
to a single integral over the points of the one-cycle $\beta$:

$$\int_{\rm vertical} fb = \int_\beta [f(L_x,y) - f(0,y)]\, b. \tag{4.81}$$

We now observe that $[f(L_x,y) - f(0,y)]$ is a constant, and so can be taken
out of the integral. It is a constant because all paths from the point $(0,y)$ to
$(L_x,y)$ are homologous to the one-cycle $\alpha$, so the difference $f(L_x,y) - f(0,y)$
is equal to $\int_\alpha a$. Thus

$$\int_\beta [f(L_x,y) - f(0,y)]\, b = \int_\alpha a \int_\beta b. \tag{4.82}$$

Similarly, the contribution of the two horizontal sides is

$$\int_\alpha [f(x,0) - f(x,L_y)]\, b = -\int_\beta a \int_\alpha b. \tag{4.83}$$

On putting the contributions of both pairs of sides together, the claimed
result follows.
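
The identity (4.79) is easy to test numerically. The sketch below is our own check and uses assumptions that are not in the text: a torus with $L_x = L_y = 2\pi$, two specific closed one-forms with hypothetical constants P, Q, R, S, and a uniform-grid quadrature (which is exact, up to rounding, for trigonometric polynomials of low degree).

```python
import math

TWO_PI = 2 * math.pi
P, Q = 0.3, -1.1   # constant parts of a (hypothetical values)
R, S = 0.8, 0.5    # constant parts of b (hypothetical values)

def a(x, y):
    """Closed one-form a = (P + cos x) dx + (Q + sin y) dy."""
    return P + math.cos(x), Q + math.sin(y)

def b(x, y):
    """Closed one-form b = R dx + S dy + d(sin x cos y)."""
    return R + math.cos(x) * math.cos(y), S - math.sin(x) * math.sin(y)

def cycle_integral(form, along, n=64):
    """Integral over alpha (along='x', at y = 0) or beta (along='y', at x = 0)."""
    h = TWO_PI / n
    if along == "x":
        return sum(form(k * h, 0.0)[0] for k in range(n)) * h
    return sum(form(0.0, k * h)[1] for k in range(n)) * h

def wedge_integral(u, v, n=64):
    """Integral of u ^ v = (u_x v_y - u_y v_x) dx dy over the torus."""
    h = TWO_PI / n
    total = 0.0
    for i in range(n):
        for j in range(n):
            ux, uy = u(i * h, j * h)
            vx, vy = v(i * h, j * h)
            total += (ux * vy - uy * vx) * h * h
    return total

lhs = wedge_integral(a, b)
rhs = (cycle_integral(a, "x") * cycle_integral(b, "y")
       - cycle_integral(b, "x") * cycle_integral(a, "y"))
```

Both `lhs` and `rhs` should equal $4\pi^2(PS - RQ)$: only the constant (non-exact) parts of a and b contribute to either side.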



4.6 Characteristic Classes 

A supply of elements of $H^{2m}(M,\mathbb{R})$ and $H^{2m}(M,\mathbb{Z})$ is provided by the
characteristic classes associated with connections on vector bundles over the
manifold M.

Recall that connections appear in covariant derivatives

$$\nabla_\mu = \partial_\mu + A_\mu, \tag{4.84}$$

and are to be thought of as matrix-valued one-forms $A = A_\mu dx^\mu$. In the
quantum mechanics of charged particles the covariant derivative that appears
in the Schrödinger equation is

$$\nabla_\mu = \partial_\mu - ieA_\mu^{\rm Maxwell}. \tag{4.85}$$

Here e is the charge of the particle on whose wavefunction the derivative acts,
and $A_\mu^{\rm Maxwell}$ is the usual electromagnetic vector potential. The matrix-valued
connection one-form is therefore

$$A = -ieA_\mu^{\rm Maxwell}\, dx^\mu. \tag{4.86}$$

In this case the matrix is one-by-one.

In a non-abelian gauge theory with gauge group G the connection becomes

$$A = i\lambda_a A^a_\mu\, dx^\mu. \tag{4.87}$$

The $\lambda_a$ are hermitian matrices that have commutation relations $[\lambda_a, \lambda_b] =
i f_{ab}{}^{c}\lambda_c$, where the $f_{ab}{}^{c}$ are the structure constants of the Lie algebra of G. The
$\lambda_a$ therefore form a representation of the Lie algebra, and this representation
plays the role of the "charge" of the non-abelian gauge particle.

For covariant derivatives acting on a tangent vector field $f^a e_a$ on a
Riemann n-manifold, where the $e_a$ are an orthonormal vielbein frame, we have

$$A = \omega_{ab\mu}\, dx^\mu, \tag{4.88}$$

where, for each $\mu$, the coefficients $\omega_{ab\mu} = -\omega_{ba\mu}$ can be thought of as the
entries in a skew symmetric n-by-n matrix. These matrices are elements of
the Lie algebra o(n) of O(n).

In all these cases we define the curvature two-form to be $F = dA + A^2$,
where a combined matrix and wedge product is to be understood in $A^2$.
In exercises 2.19 and 2.20 you used the Bianchi identity to show that the
gauge-invariant 2n-forms ${\rm tr}(F^n)$ were closed. The integrals of these forms
over cycles provide numbers that are topological invariants of the bundle.
For example, in four-dimensional QCD, the integral

(4.89)

over a compactified four-dimensional manifold $\Omega$ is an integer that a
mathematician would call the second Chern number of the non-abelian gauge
bundle, and that a physicist would call the instanton number of the gauge
field configuration.

In this section we will show that the integrals of such characteristic classes
are indeed topological invariants. We also explain something of what these
invariants are measuring, and illustrate why, when suitably normalized,
certain of them are integer valued.

4.6.1 Topological invariance

Suppose that we have been given a connection A and slightly deform it
$A \to A + \delta A$, then

$$\delta F = d(\delta A) + \delta A\, A + A\, \delta A. \tag{4.90}$$

Using the Bianchi identity $dF = FA - AF$, we find that

$$\begin{aligned}
\delta\, {\rm tr}(F^n) &= n\, {\rm tr}(\delta F\, F^{n-1})\\
&= n\, {\rm tr}(d(\delta A) F^{n-1}) + n\, {\rm tr}(\delta A\, A F^{n-1}) + n\, {\rm tr}(A\, \delta A\, F^{n-1})\\
&= n\, {\rm tr}(d(\delta A) F^{n-1}) + n\, {\rm tr}(\delta A\, A F^{n-1}) - n\, {\rm tr}(\delta A\, F^{n-1} A)\\
&= d\left\{ n\, {\rm tr}(\delta A\, F^{n-1}) \right\}.
\end{aligned} \tag{4.91}$$




The last line of (4.91) is equal to the penultimate line because all but the first
and last terms arising from the dF's in $d\{{\rm tr}(\delta A\, F^{n-1})\}$ cancel in pairs. A
globally defined change in A therefore changes ${\rm tr}(F^n)$ by the d of something,
and so does not change its cohomology class, or its integral over a cycle.

At first sight, this invariance under deformation suggests that all the
${\rm tr}(F^n)$ are exact forms: they can apparently all be written as ${\rm tr}(F^n) =
d\omega_{2n-1}(A)$ for some (2n − 1)-form $\omega_{2n-1}(A)$. To find $\omega_{2n-1}(A)$ all we have to
do is deform the connection to zero by setting $A_t = tA$ and

$$F_t = dA_t + A_t^2 = t\,dA + t^2 A^2. \tag{4.92}$$

Then $\delta A_t = A\,\delta t$, and

$$\frac{d}{dt}\, {\rm tr}(F_t^n) = d\left\{ n\, {\rm tr}(A F_t^{n-1}) \right\}. \tag{4.93}$$

Integrating up from t = 0, we find

$${\rm tr}(F^n) = d\left\{ n \int_0^1 {\rm tr}(A F_t^{n-1})\, dt \right\}. \tag{4.94}$$

For example

$$\begin{aligned}
{\rm tr}(F^2) &= d\left\{ 2 \int_0^1 {\rm tr}\left(A(t\,dA + t^2 A^2)\right) dt \right\}\\
&= d\left\{ {\rm tr}\left(A\,dA + \tfrac{2}{3} A^3\right) \right\}.
\end{aligned} \tag{4.95}$$

You should recognize here the $\omega_3(A) = {\rm tr}(A\,dA + \tfrac{2}{3}A^3)$ Chern-Simons form of
exercise 2.19. The naive conclusion, that all the ${\rm tr}(F^n)$ are exact, is false,
however. What the computation actually shows is that when $\int {\rm tr}(F^n) \ne 0$
we cannot find a globally defined one-form A representing the connection or
gauge field. With no global A, we cannot globally deform A to zero.
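
The numerical coefficients produced by the t-integration in (4.94) follow a simple pattern, which the short Python sketch below tabulates. This is our own bookkeeping aid, not something taken from the text: a word in ${\rm tr}(A F_t^{n-1})$ containing j factors of $A^2$ and n − 1 − j factors of dA carries the power $t^{n-1+j}$, so after integration it is weighted by $n/(n+j)$. The function name is hypothetical.

```python
from fractions import Fraction

def chern_simons_weights(n):
    """Weight n * int_0^1 t^(n-1+j) dt = n/(n+j) multiplying each word in
    tr(A F_t^(n-1)) that contains j factors of A^2 and n-1-j factors of dA."""
    return {j: Fraction(n, n + j) for j in range(n)}
```

For n = 2 this gives the weights 1 and 2/3 that appear in (4.95); for n = 3 it gives 1, 3/4 and 3/5.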

Consider, for example, an Abelian U(1) gauge field on the two-sphere $S^2$.
When the first Chern number

$$c_1 = \frac{1}{2\pi i} \int_{S^2} F \tag{4.96}$$

is non-zero, there can be no globally defined one-form A such that F =
dA. Glance back, however, at figure 4.10 on page 141. There we see that
the contractibility of the spherical caps $D_\pm$ guarantees that there are
one-forms $A_\pm$ defined on $D_\pm$ such that $F = dA_\pm$ in $D_\pm$. In the cingular region
$D_+ \cap D_-$ where they are both defined, $A_+$ and $A_-$ will be related by a gauge
transformation. For a U(1) gauge field, the matrix g appearing in the general
gauge transformation rule

$$A \to A^g \equiv g^{-1} A g + g^{-1} dg, \tag{4.97}$$

of exercise 2.20 becomes the phase $e^{i\chi} \in U(1)$. Consequently

$$A_+ = A_- + e^{-i\chi}\, d e^{i\chi} = A_- + i\, d\chi \quad \hbox{in } D_+ \cap D_-. \tag{4.98}$$

The U(1) group element $e^{i\chi}$ is required to be single valued in $D_+ \cap D_-$, but
the angle $\chi$ may be multivalued. We now write $c_1$ as the sum of integrals over
the north and south hemispheres of $S^2$, and use Stokes' theorem to reduce
this sum to a single integral over the hemispheres' common boundary, the
equator $\Gamma$:

$$\begin{aligned}
c_1 &= \frac{1}{2\pi i} \int_{\rm north} F + \frac{1}{2\pi i} \int_{\rm south} F\\
&= \frac{1}{2\pi i} \int_{\rm north} dA_+ + \frac{1}{2\pi i} \int_{\rm south} dA_-\\
&= \frac{1}{2\pi i} \oint_\Gamma A_+ - \frac{1}{2\pi i} \oint_\Gamma A_-\\
&= \frac{1}{2\pi} \oint_\Gamma d\chi.
\end{aligned} \tag{4.99}$$

We see that $c_1$ is the integer counting the winding of $\chi$ as we circle $\Gamma$. An
integer cannot be continuously reduced to zero, and if we attempt to deform
$A \to tA \to 0$, we will violate the required single-valuedness of the U(1) group
element $e^{i\chi}$.
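
The last line of (4.99) can be evaluated numerically for any given transition function. The Python sketch below is an illustration under our own assumptions: the equator is parametrized by $s \in [0, 2\pi)$, the sample transition function (with its `monopole_number` parameter and small "wiggle") is hypothetical, and the winding is accumulated from the phase increments of the single-valued quantity $e^{i\chi}$ rather than from the multivalued $\chi$ itself.

```python
import cmath
import math

def winding_number(g, n=1000):
    """Winding of a U(1)-valued function g(s), s in [0, 2 pi], around the
    equator: (1 / 2 pi) times the sum of the phase increments of g between
    neighbouring sample points (each increment must be smaller than pi)."""
    total = 0.0
    for k in range(n):
        s0 = 2 * math.pi * k / n
        s1 = 2 * math.pi * (k + 1) / n
        total += cmath.phase(g(s1) / g(s0))
    return round(total / (2 * math.pi))

def transition(s, monopole_number=2):
    """Hypothetical transition function exp(i chi), with
    chi = monopole_number * s + 0.3 sin(3 s)."""
    return cmath.exp(1j * (monopole_number * s + 0.3 * math.sin(3 * s)))
```

The single-valued wiggle $0.3\sin 3s$ does not affect the result: `winding_number(transition)` depends only on the integer multiplying s.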

Although the Chern-Simons forms $\omega_{2n-1}(A)$ cannot be defined globally,
they are still very useful in physics. They occur as Wess-Zumino terms
describing the low energy properties of various quantum field theories, the
prototype being the Skyrme-Witten model of hadrons.$^2$

$^2$ E. Witten, Nucl. Phys. B223 (1983) 422; ibid. B223 (1983) 433.

4.6.2 Chern characters and Chern classes

Any gauge-invariant polynomial (with exterior multiplication of forms
understood) in F provides a closed, topologically invariant, differential form.
Certain combinations, however, have additional desirable properties, and so
have been given names.

The form

$${\rm ch}_n(F) = {\rm tr}\left\{ \frac{1}{n!} \left( \frac{i}{2\pi} F \right)^n \right\} \tag{4.100}$$

is called the n-th Chern character. It is convenient to think of this 2n-form
as being the n-th term in a generating-function expansion

$${\rm ch}(F) = {\rm tr}\left\{ \exp\left( \frac{i}{2\pi} F \right) \right\} = {\rm ch}_0(F) + {\rm ch}_1(F) + {\rm ch}_2(F) + \cdots, \tag{4.101}$$

where ${\rm ch}_0(F) \equiv {\rm tr}\, I$ is the dimension of the space on which the $\lambda_a$ act. This
formal sum of forms of different degree is called the total Chern character.
The n! normalization is chosen because it makes the Chern character behave
nicely when we combine vector bundles.

Given two vector bundles over the same manifold, having fibres $U_x$ and $V_x$
over the point x, we can make a new bundle with the direct sum $U_x \oplus V_x$ as
fibre over x. This resulting bundle is called the Whitney sum of the bundles.
Similarly we can make a tensor-product bundle whose fibre over x is $U_x \otimes V_x$.

Let us use the notation ch(U) to represent the Chern character of the
bundle with fibres $U_x$, and $U \oplus V$ to denote the Whitney sum. Then we have

$${\rm ch}(U \oplus V) = {\rm ch}(U) + {\rm ch}(V), \tag{4.102}$$

and

$${\rm ch}(U \otimes V) = {\rm ch}(U) \wedge {\rm ch}(V). \tag{4.103}$$

The second of these formulae comes about because if $\lambda_a^{(1)}$ is a Lie algebra
element acting on $V^{(1)}$ and $\lambda_a^{(2)}$ the corresponding element acting on $V^{(2)}$,
then they act on the tensor product $V^{(1)} \otimes V^{(2)}$ as

$$\lambda_a^{(1 \otimes 2)} = \lambda_a^{(1)} \otimes I + I \otimes \lambda_a^{(2)}, \tag{4.104}$$

where I is the identity operator, and for matrices A, B,

$${\rm tr}\{\exp(A \otimes I + I \otimes B)\} = {\rm tr}\{\exp A \otimes \exp B\} = {\rm tr}\{\exp A\}\, {\rm tr}\{\exp B\}. \tag{4.105}$$

In terms of the individual ${\rm ch}_n(V)$ equations (4.102) and (4.103) read

$${\rm ch}_n(U \oplus V) = {\rm ch}_n(U) + {\rm ch}_n(V), \tag{4.106}$$

and

$${\rm ch}_n(U \otimes V) = \sum_{m=0}^{n} {\rm ch}_{n-m}(U) \wedge {\rm ch}_m(V). \tag{4.107}$$

Related to the Chern characters are the Chern classes. These are
wedge-product polynomials in the Chern characters, and are defined, via the matrix
expansion

$$\det(I + A) = 1 + {\rm tr}\, A + \frac{1}{2}\left( ({\rm tr}\, A)^2 - {\rm tr}\, A^2 \right) + \ldots, \tag{4.108}$$

by the generating function for the total Chern class

$$c(F) = \det\left( I + \frac{i}{2\pi} F \right) = 1 + c_1(F) + c_2(F) + \cdots. \tag{4.109}$$

Thus

$$c_1(F) = {\rm ch}_1(F), \qquad c_2(F) = \frac{1}{2}\, {\rm ch}_1(F) \wedge {\rm ch}_1(F) - {\rm ch}_2(F), \tag{4.110}$$

and so on.
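
The quadratic term in the expansion (4.108) is the second elementary symmetric function of the eigenvalues of A, which equals the sum of the principal 2-by-2 minors. The Python sketch below checks this for one ordinary integer matrix; the test matrix and helper names are our own choices. For curvature forms the same algebra applies because even-degree forms commute under the wedge product.

```python
from fractions import Fraction
from itertools import combinations

def trace(m):
    return sum(m[i][i] for i in range(len(m)))

def matmul(a, b):
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def second_order_term(m):
    """(1/2)((tr A)^2 - tr A^2), the quadratic term in det(I + A)."""
    return Fraction(trace(m) ** 2 - trace(matmul(m, m)), 2)

def sum_of_2x2_minors(m):
    """Sum of the principal 2-by-2 minors of A."""
    return sum(m[i][i] * m[j][j] - m[i][j] * m[j][i]
               for i, j in combinations(range(len(m)), 2))

# An arbitrary test matrix (illustrative only).
A = [[2, -1, 0], [3, 1, 4], [1, 5, -2]]
```

With $A \to iF/2\pi$, so that ${\rm ch}_1 = {\rm tr}\, A$ and ${\rm ch}_2 = \frac{1}{2}{\rm tr}\, A^2$, the quantity `second_order_term(A)` is exactly the combination $\frac{1}{2}{\rm ch}_1^2 - {\rm ch}_2$ appearing in (4.110).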

For matrices A and B we have $\det(A \oplus B) = \det(A) \det(B)$, and this
leads to

$$c(U \oplus V) = c(U) \wedge c(V). \tag{4.111}$$

Although the Chern classes are more complicated in appearance than the
Chern characters, they are introduced because their integrals over cycles are
integers, and this property remains true of integer-coefficient sums of
products of Chern classes. The cohomology classes $[c_n(F)]$ are therefore elements
of the integer cohomology ring $H^\bullet(M,\mathbb{Z})$. This property does not hold for
the Chern characters, whose integrals over cycles can be fractions. The
cohomology classes $[{\rm ch}_n(F)]$ are therefore only elements of $H^\bullet(M,\mathbb{Q})$.

When we integrate products of Chern classes of total degree 2m over
a closed 2m-dimensional orientable manifold we get integer Chern numbers.
These integers can be related to generalized winding numbers, and characterize
the extent to which the gauge transformations that relate the connection
fields in different patches serve to twist the vector bundle. Unfortunately
it requires a considerable amount of machinery (the Schubert calculus of
complex Grassmannians) to explain these integers.

Pontryagin and Euler classes

When the fibres of a vector bundle are vector spaces over $\mathbb{R}$, the complex
skew-hermitian matrices $i\lambda_a$ are replaced by real skew symmetric matrices.
The Lie algebra of the n-by-n matrices $i\lambda_a$ was a subalgebra of u(n). The Lie
algebra of the n-by-n real, skew symmetric, matrices is a subalgebra of o(n).
Now the trace of an odd power of any skew symmetric matrix is zero. As a
consequence, Chern characters and Chern classes containing an odd number
of F's all vanish. The remaining real 4n-forms are known as Pontryagin
classes. The precise definition is

$$p_k(V) = (-1)^k c_{2k}(V). \tag{4.112}$$

Pontryagin classes help to classify bundles whose gauge transformations
are elements of O(n). If we restrict ourselves to gauge transformations that lie
in SO(n), as we would when considering the tangent bundle of an orientable
Riemann manifold, then we can make a gauge-invariant polynomial out of
the skew-symmetric matrix-valued F by forming its Pfaffian.

Recall (or see exercise ??.??) that the Pfaffian of a skew symmetric
2n-by-2n matrix A with entries $a_{ij}$ is

$${\rm Pf}\, A = \frac{1}{2^n n!}\, \epsilon_{i_1 \ldots i_{2n}}\, a_{i_1 i_2} \cdots a_{i_{2n-1} i_{2n}}. \tag{4.113}$$
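
The definition (4.113) can be evaluated directly for small matrices. The Python sketch below is our own illustration: it sums over all permutations (which is only practical for small n), uses an arbitrary 4-by-4 skew symmetric test matrix, and compares the result with the familiar property $({\rm Pf}\, A)^2 = \det A$. The helper names are hypothetical.

```python
from fractions import Fraction
from itertools import permutations
from math import factorial

def sign(perm):
    """Sign of a permutation of 0..k-1, found by counting inversions."""
    inversions = sum(1 for i in range(len(perm))
                     for j in range(i + 1, len(perm)) if perm[i] > perm[j])
    return -1 if inversions % 2 else 1

def pfaffian(a):
    """Pf A from (4.113): (1 / (2^n n!)) times a signed sum over permutations."""
    n = len(a) // 2
    total = 0
    for perm in permutations(range(2 * n)):
        term = sign(perm)
        for k in range(n):
            term *= a[perm[2 * k]][perm[2 * k + 1]]
        total += term
    return Fraction(total, 2 ** n * factorial(n))

def det(m):
    """Determinant by Leibniz expansion (adequate for small matrices)."""
    total = 0
    for perm in permutations(range(len(m))):
        term = sign(perm)
        for i in range(len(m)):
            term *= m[i][perm[i]]
        total += term
    return total

# An arbitrary skew symmetric test matrix (illustrative only).
A = [[0, 1, 2, 3],
     [-1, 0, 4, 5],
     [-2, -4, 0, 6],
     [-3, -5, -6, 0]]
```

For a 4-by-4 matrix the sum collapses to $a_{12}a_{34} - a_{13}a_{24} + a_{14}a_{23}$, which for this test matrix is 6 − 10 + 12 = 8.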

The Euler class of the tangent bundle of a 2n-dimensional orientable manifold
is defined via its skew-symmetric Riemann-curvature form

$$R = \frac{1}{2} R_{ab,\mu\nu}\, dx^\mu dx^\nu \tag{4.114}$$

to be

$$e(R) = {\rm Pf}\left( \frac{1}{2\pi} R \right). \tag{4.115}$$

In four dimensions, for example, this becomes the 4-form

$$e(R) = \frac{1}{32\pi^2}\, \epsilon_{abcd} R_{ab} R_{cd}. \tag{4.116}$$

The generalized Gauss-Bonnet theorem asserts, for an oriented, even-dimensional,
manifold without boundary, that the Euler character is given by

$$\chi(M) = \int_M e(R). \tag{4.117}$$

We will not prove this theorem, but in section 7.3.6 we will illustrate the
strategy that leads to Chern's influential proof.

Exercise 4.5: Show that 



4.7 Hodge Theory and the Morse Index 



The Laplacian, when acting on a scalar function $\phi$ in $\mathbb{R}^3$ is simply div (grad $\phi$),
but when acting on a vector v it becomes

$$\nabla^2 {\bf v} = {\rm grad}\,({\rm div}\, {\bf v}) - {\rm curl}\,({\rm curl}\, {\bf v}). \tag{4.118}$$

Is there a general construction that would have allowed us to write down this
second expression? What about the Laplacian on other types of fields?

The Laplacian acting on any vector or tensor field T in $\mathbb{R}^n$ is given,
in general curvilinear co-ordinates, by $\nabla^2 T = g^{\mu\nu}\nabla_\mu\nabla_\nu T$ where $\nabla_\mu$ is the
flat-space covariant derivative. This is the unique co-ordinate independent
object that reduces in Cartesian co-ordinates to the ordinary Laplacian acting
on the individual components of T. The proof that the rather
different-seeming (4.118) holds for vectors is that it too is constructed out of
co-ordinate independent operations and in Cartesian co-ordinates reduces to
the ordinary Laplacian acting on the individual components of v. It must
therefore coincide with the covariant derivative definition. Why it should
work out this way is not exactly obvious. Now div, grad and curl can all be
expressed in differential form language, and therefore so can the scalar and
vector Laplacian. Moreover, when we let the Laplacian act on any p-form
the general pattern becomes clear. The differential form definition of the
Laplacian, and the exploration of its consequences, was the work of William
Hodge in the 1930's. His theory has natural applications to the topology of
manifolds.

4.7.1 The Laplacian on p-forms

Suppose that M is an oriented, compact, D-dimensional manifold without
boundary. We can make the space $\Omega^p(M)$ of p-form fields on M into an $L^2$
Hilbert space by introducing the positive-definite inner product

$$\langle a, b\rangle_p = \langle b, a\rangle_p = \int_M a \star b = \frac{1}{p!} \int_M d^Dx\, \sqrt{g}\; a_{i_1 i_2 \ldots i_p} b^{i_1 i_2 \ldots i_p}. \tag{4.119}$$

Here the subscript p denotes the order of the forms in the product, and should
not be confused with the p we have elsewhere used to label the norm in
$L^p$ Banach spaces. The presence of the $\sqrt{g}$ and the Hodge $\star$ operator tells
us that this inner product depends on both the metric on M and the global
orientation.

We can use our new product to define a "hermitian adjoint" $\delta \equiv d^\dagger$ of
the exterior differential operator d. The quotation marks are there because this
is not quite an adjoint operator in the normal sense (d takes us from one vector
space to another) but it is constructed in an analogous manner. We define $\delta$ by
requiring that

$$\langle da, b\rangle_{p+1} = \langle a, \delta b\rangle_p, \tag{4.120}$$

where a is an arbitrary p-form and b an arbitrary (p + 1)-form. Now recall
that $\star$ takes p-forms to (D − p) forms, and so $d \star b$ is a (D − p) form. Acting
twice on a (D − p)-form with $\star$ gives us back the original form multiplied by
$(-1)^{p(D-p)}$. We use this to compute

$$\begin{aligned}
d(a \star b) &= da \star b + (-1)^p a\,(d \star b)\\
&= da \star b + (-1)^p (-1)^{p(D-p)} a \star (\star\, d \star b)\\
&= da \star b - (-1)^{Dp+1} a \star (\star\, d \star b).
\end{aligned} \tag{4.121}$$

In obtaining the last line we have observed that p(p − 1) is an even integer
and so $(-1)^{p(1-p)} = 1$. Now, using Stokes' theorem, and the absence of a
boundary to discard the integrated-out part, we conclude that

$$\int_M (da) \star b = (-1)^{Dp+1} \int_M a \star (\star\, d \star b), \tag{4.122}$$

or

$$\langle da, b\rangle_{p+1} = (-1)^{Dp+1} \langle a, (\star\, d\, \star) b\rangle_p \tag{4.123}$$

and so $\delta b = (-1)^{Dp+1} (\star\, d\, \star) b$. This was for $\delta$ acting on a (p + 1)-form. Acting
on a p-form we have

$$\delta = (-1)^{Dp+D+1} \star d\, \star. \tag{4.124}$$

Observe how the sequence of maps in $\star\, d\, \star$ works:

$$\Omega^p(M) \stackrel{\star}{\to} \Omega^{D-p}(M) \stackrel{d}{\to} \Omega^{D-p+1}(M) \stackrel{\star}{\to} \Omega^{p-1}(M). \tag{4.125}$$

The net effect is that $\delta$ takes a p-form to a (p − 1)-form. Observe also that
$\delta^2 \propto \star\, d^2 \star = 0$.

We now define a second-order partial differential operator $\Delta_p$ to be the
combination

$$\Delta_p = \delta d + d\delta, \tag{4.126}$$

acting on p-forms. This maps a p-form to a p-form. A slightly tedious
calculation in Cartesian co-ordinates will show that, for flat space,

$$\Delta_p = -\nabla^2 \tag{4.127}$$

on each component of a p-form. This $\Delta_p$ is therefore the natural definition
for (minus) the Laplacian acting on differential forms. It is usually called the
Laplace-Beltrami operator.

Using $\langle a, db\rangle = \langle \delta a, b\rangle$ we have

$$\langle (\delta d + d\delta) a, b\rangle_p = \langle \delta a, \delta b\rangle_{p-1} + \langle da, db\rangle_{p+1} = \langle a, (\delta d + d\delta) b\rangle_p, \tag{4.128}$$

and so we deduce that $\Delta_p$ is self-adjoint on $\Omega^p(M)$. When b = a the middle
terms in (4.128) are both non-negative, so we also see that $\Delta_p$ is a positive
operator, i.e. all its eigenvalues are positive or zero.

Suppose that $\Delta_p a = 0$, then (4.128) for a = b becomes

$$0 = \langle \delta a, \delta a\rangle_{p-1} + \langle da, da\rangle_{p+1}. \tag{4.129}$$

Because both these inner products are positive or zero, the vanishing of
their sum requires them to be individually zero. Thus $\Delta_p a = 0$ implies that
$da = \delta a = 0$. By analogy with harmonic functions, we call a form that is
annihilated by $\Delta_p$ a harmonic form. Recall that a form a is closed if da = 0.
We correspondingly say that a is co-closed if $\delta a = 0$. A differential form is
therefore harmonic if and only if it is both closed and co-closed.

When a self-adjoint operator A is Fredholm (i.e. the solutions of the
equation Ax = y are governed by the Fredholm alternative) the vector space on
which it acts is decomposed into a direct sum of the kernel and range of the
operator

$$V = {\rm Ker}(A) \oplus {\rm Im}(A). \tag{4.130}$$

It may be shown that our Laplace-Beltrami $\Delta_p$ is a Fredholm operator, and
so for any p-form $\omega$ there is an $\eta$ such that $\omega$ can be written as

$$\begin{aligned}
\omega &= (d\delta + \delta d)\eta + \gamma\\
&= d\alpha + \delta\beta + \gamma,
\end{aligned} \tag{4.131}$$

where $\alpha = \delta\eta$, $\beta = d\eta$, and $\gamma$ is harmonic. This result is known as the
Hodge decomposition of $\omega$. It is a form-language generalization of the
Hodge-Weyl and Helmholtz-Hodge decompositions of chapter ??. It is easy
to see that the three terms $d\alpha$, $\delta\beta$ and $\gamma$ are uniquely determined by $\omega$. If they
were not then we could find some $\alpha$, $\beta$ and $\gamma$ such that

$$0 = d\alpha + \delta\beta + \gamma \tag{4.132}$$

with non-zero $d\alpha$, $\delta\beta$ and $\gamma$. To see that this is not possible, take the d of
(4.132) and then the inner product of the result with $\beta$. Because $d(d\alpha) =
d\gamma = 0$, we end up with

$$\begin{aligned}
0 &= \langle \beta, d\delta\beta\rangle\\
&= \langle \delta\beta, \delta\beta\rangle.
\end{aligned} \tag{4.133}$$

Thus $\delta\beta = 0$. Now apply $\delta$ to the two remaining terms of (4.132) and take an
inner product with $\alpha$. Because $\delta\gamma = 0$, we find $\langle d\alpha, d\alpha\rangle = 0$, and so $d\alpha = 0$.
What now remains of (4.132) asserts that $\gamma = 0$.

Suppose that $\omega$ is closed. Then our strategy of taking the d of the
decomposition

$$\omega = d\alpha + \delta\beta + \gamma, \tag{4.134}$$

followed by an inner product with $\beta$ leads to $\delta\beta = 0$. A closed form can thus
be decomposed as

$$\omega = d\alpha + \gamma \tag{4.135}$$

with $d\alpha$ and $\gamma$ unique. Each cohomology class in $H^p(M)$ therefore contains
a unique harmonic representative. Since any harmonic form is closed,
and hence a representative of some cohomology class, we conclude that there
is a 1-1 correspondence between p-form solutions of Laplace's equation and
elements of $H^p(M)$. In particular

$$\dim({\rm Ker}\, \Delta_p) = \dim(H^p(M)) = b_p. \tag{4.136}$$

Here $b_p$ is the p-th Betti number. From this we immediately deduce that

$$\chi(M) = \sum_{p=0}^{D} (-1)^p \dim({\rm Ker}\, \Delta_p), \tag{4.137}$$

where $\chi(M)$ is the Euler character of M. There is therefore an intimate
relationship between the null-spaces of the second-order partial differential
operators $\Delta_p$ and the global topology of the manifold in which they live.
This is an example of an index theorem.
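
A finite-dimensional analogue of (4.136) and (4.137) can be worked out by hand or by machine. The Python sketch below is a discrete illustration under our own assumptions and is not the smooth theory of the text: we replace the manifold by a hollow triangle (a triangulated circle), the exterior derivative by the transpose of the simplicial boundary matrix, and $\Delta_p$ by the combinatorial Laplacians built from it. The Betti numbers are then read off as the dimensions of the kernels.

```python
from fractions import Fraction

def rank(m):
    """Rank of a rational matrix by Gauss-Jordan elimination."""
    rows = [[Fraction(x) for x in row] for row in m]
    r = 0
    ncols = len(rows[0]) if rows else 0
    for c in range(ncols):
        pivot = next((i for i in range(r, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[r], rows[pivot] = rows[pivot], rows[r]
        for i in range(len(rows)):
            if i != r and rows[i][c] != 0:
                factor = rows[i][c] / rows[r][c]
                rows[i] = [x - factor * y for x, y in zip(rows[i], rows[r])]
        r += 1
    return r

def matmul(a, b):
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)]
            for row in a]

def transpose(m):
    return [list(col) for col in zip(*m)]

# Boundary operator from the edges (01, 12, 02) to the vertices (0, 1, 2)
# of a hollow triangle; each column is (end vertex) - (start vertex).
D1 = [[-1, 0, -1],
      [1, -1, 0],
      [0, 1, 1]]

L0 = matmul(D1, transpose(D1))   # Laplacian on 0-chains
L1 = matmul(transpose(D1), D1)   # Laplacian on 1-chains (no 2-simplices)

b0 = len(L0) - rank(L0)          # dimension of the kernel of L0
b1 = len(L1) - rank(L1)          # dimension of the kernel of L1
euler = b0 - b1
```

For the circle we expect $b_0 = b_1 = 1$ and hence $\chi = 0$, which is what the kernel dimensions give.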

Just as for the ordinary Laplace operator, $\Delta_p$ has a complete set of
eigenfunctions with associated eigenvalues $\lambda$. Because the manifold is compact
and hence has finite volume, the spectrum will be discrete. Remarkably, the
topological influence we uncovered above is restricted to the zero-eigenvalue
spaces. Suppose that we have a p-form eigenfunction $u_\lambda$ for $\Delta_p$:

$$\Delta_p u_\lambda = \lambda u_\lambda. \tag{4.138}$$

Then

$$\begin{aligned}
\lambda\, du_\lambda &= d\, \Delta_p u_\lambda\\
&= d(d\delta + \delta d) u_\lambda\\
&= (d\delta)\, du_\lambda\\
&= (\delta d + d\delta)\, du_\lambda\\
&= \Delta_{p+1}\, du_\lambda.
\end{aligned} \tag{4.139}$$

Thus, provided it is not identically zero, $du_\lambda$ is a (p + 1)-form eigenfunction
of $\Delta_{p+1}$ with eigenvalue $\lambda$. Similarly, $\delta u_\lambda$ is a (p − 1)-form eigenfunction
also with eigenvalue $\lambda$.

Can $du_\lambda$ be zero? Yes! It will certainly be zero if $u_\lambda$ itself is the d of
something. What is less obvious is that it will be zero only if it is the d of
something. To see this suppose that $du_\lambda = 0$ and $\lambda \ne 0$. Then

$$\lambda u_\lambda = (\delta d + d\delta) u_\lambda = d(\delta u_\lambda). \tag{4.140}$$

Thus $du_\lambda = 0$ implies that $u_\lambda = d\eta$, where $\eta = \delta u_\lambda/\lambda$. We see that for $\lambda$
non-zero, the operators d and $\delta$ map the $\lambda$ eigenspaces of $\Delta$ into one another,
and the kernel of d acting on p-form eigenfunctions is precisely the image of
d acting on (p − 1)-form eigenfunctions. In other words, when restricted to
positive $\lambda$ eigenspaces of $\Delta$, the cohomology is trivial.

The set of spaces $V^p_\lambda$ together with the maps $d: V^p_\lambda \to V^{p+1}_\lambda$ therefore
constitute an exact sequence when $\lambda \ne 0$, and so the alternating sum of their
dimensions must be zero. We have therefore established that

$$\sum_p (-1)^p \dim V^p_\lambda = \begin{cases} \chi(M), & \lambda = 0,\\ 0, & \lambda \ne 0. \end{cases} \tag{4.141}$$

All the topology resides in the null-spaces, therefore.

Exercise 4.6: Show that if $\omega$ is closed and co-closed then so is $\star\omega$. Deduce
that for a compact orientable D-manifold we have $b_p = b_{D-p}$. This
observation therefore gives another way of understanding Poincaré duality.



4.7.2 Morse Theory 

Suppose, as in the previous section, M is a D-dimensional compact manifold
without boundary and $V : M \to \mathbb{R}$ a smooth function. The global topology
of M imposes some constraints on the possible maxima, minima and saddle
points of V. Suppose that P is a stationary point of V. Taking co-ordinates
such that P is at $x^\mu = 0$, we can expand

$$V(x) = V(0) + \frac{1}{2} H_{\mu\nu} x^\mu x^\nu + \ldots. \tag{4.142}$$

Here, the matrix $H_{\mu\nu}$ is the Hessian

$$H_{\mu\nu} = \left. \frac{\partial^2 V}{\partial x^\mu \partial x^\nu} \right|_P. \tag{4.143}$$

We can change co-ordinates so as to reduce the Hessian to a canonical form
with only ±1, 0 on the diagonal:

$$H_{\mu\nu} \to \begin{pmatrix} -I_m & & \\ & I_n & \\ & & 0_{D-m-n} \end{pmatrix}. \tag{4.144}$$

If there are no zeros on the diagonal then the stationary point is said to be
non-degenerate. The number m of downward-bending directions is then
called the index of V at P. If P were a local maximum, then m = D, n = 0.
If it were a local minimum then m = 0, n = D. When all its stationary
points are non-degenerate, V is said to be a Morse function. This is the
generic case. Degenerate stationary points can be regarded as arising from
the merging of two or more non-degenerate points.

The Morse index theorem asserts that if $V$ is a Morse function, and if
we define $N_0$ to be the number of stationary points with index 0 (i.e. local
minima), and $N_1$ to be the number of stationary points with index 1 etc.,
then

$$\sum_{m=0}^{D} (-1)^m N_m = \chi(M). \tag{4.145}$$

Here $\chi(M)$ is the Euler character of $M$. Thus, a function on the
two-dimensional torus, which has $\chi = 0$, can have a local maximum, a local
minimum and two saddle points, but cannot have only one local maximum,
one local minimum and no saddle points. On a two-sphere ($\chi = 2$), if $V$
has one local maximum and one local minimum it can have no saddle points.
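The torus count can be checked on a concrete function. The sketch below assumes the particular Morse function $V(\theta,\phi) = \cos\theta + \cos\phi$ on the torus, chosen here purely for illustration. Its stationary points sit where $\sin\theta = \sin\phi = 0$, and its Hessian is diagonal in these co-ordinates, so the index is simply the number of negative diagonal entries.

```python
import math
from itertools import product

def hessian_diag(theta, phi):
    """Diagonal Hessian entries of V = cos(theta) + cos(phi)."""
    return (-math.cos(theta), -math.cos(phi))

# Stationary points of V on the torus: sin(theta) = sin(phi) = 0.
stationary = list(product((0.0, math.pi), repeat=2))

counts = {}                                  # counts[m] = N_m
for theta, phi in stationary:
    m = sum(1 for h in hessian_diag(theta, phi) if h < 0)   # Morse index
    counts[m] = counts.get(m, 0) + 1

alternating_sum = sum((-1) ** m * n for m, n in counts.items())
print(counts, alternating_sum)
```

This function has one minimum, two saddles and one maximum, so the alternating sum vanishes, as the theorem requires for $\chi = 0$.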

Closely related to the Morse index theorem is the Poincaré-Hopf theorem.
It counts the isolated zeros of a tangent-vector field $X$ on a compact
$D$-manifold and, among other things, explains why we cannot comb a hairy
ball. An isolated zero is a point $z_n$ at which $X$ becomes zero, and that
has a neighbourhood in which there is no other zero. If there are only
finitely many zeros then each of them will be isolated. We can define a vector
field index at $z_n$ by surrounding it with a small $(D-1)$-sphere on which
$X$ does not vanish. The direction of $X$ at each point on this sphere then
provides a map from the sphere to itself. The index $i(z_n)$ is defined to be
the winding number (Brouwer degree) of this map. The index can be any integer,
but in the special case that $X$ is the gradient of a Morse function we have
$i(z_n) = (-1)^{m_n}$ where $m_n$ is the Morse index at $z_n$.

Figure 4.13: Two-dimensional vector fields and their streamlines near zeros
with indices a) $i(z_a) = +1$, b) $i(z_b) = -1$, c) $i(z_c) = +1$.

The Poincaré-Hopf theorem now states that, for a compact manifold without
boundary, and for a tangent-vector field with only finitely many zeros,

$$\sum_{{\rm zeros}\ n} i(z_n) = \chi(M). \tag{4.146}$$

A tangent-vector field must therefore always have at least one zero unless
$\chi(M) = 0$. Since the two-sphere has $\chi = 2$, it cannot be combed.
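The index of an isolated zero of a planar field is easy to estimate numerically. The sketch below, which assumes a flat two-dimensional patch and some hypothetical sample fields, walks once around a small circle and accumulates the change in the direction angle of the field; the function name `vector_field_index` and its parameters are illustrative only.

```python
import math

def vector_field_index(field, centre=(0.0, 0.0), radius=0.1, steps=720):
    """Winding number of the direction of `field` around a small circle."""
    cx, cy = centre
    total = 0.0
    prev = None
    for k in range(steps + 1):
        a = 2.0 * math.pi * k / steps
        vx, vy = field(cx + radius * math.cos(a), cy + radius * math.sin(a))
        angle = math.atan2(vy, vx)
        if prev is not None:
            d = angle - prev
            # Bring each increment into (-pi, pi] so branch cuts are not counted.
            while d > math.pi:
                d -= 2.0 * math.pi
            while d <= -math.pi:
                d += 2.0 * math.pi
            total += d
        prev = angle
    return round(total / (2.0 * math.pi))

source = lambda x, y: (x, y)          # gradient of (x^2 + y^2)/2, Morse index 0
saddle = lambda x, y: (x, -y)         # gradient of (x^2 - y^2)/2, Morse index 1
sink = lambda x, y: (-x, -y)          # gradient of -(x^2 + y^2)/2, Morse index 2
vortex = lambda x, y: (-y, x)         # circulating flow, not a gradient
double = lambda x, y: (x * x - y * y, 2.0 * x * y)   # degenerate zero

for name, f in [("source", source), ("saddle", saddle), ("sink", sink),
                ("vortex", vortex), ("double", double)]:
    print(name, vector_field_index(f))
```

The three gradient fields reproduce $i = (-1)^m$, while the last field illustrates that a general isolated zero can have an index other than $\pm 1$.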




Figure 4.14: Gradient vector field and streamlines in a two-simplex.

If one is prepared to believe that $\sum_{\rm zeros} i(z_n)$ is the same
integer for all tangent-vector fields $X$ on $M$, it is simple to show that
this integer must be equal to the Euler character of $M$. Consider, for ease
of visualization, a two-manifold. Triangulate $M$ and take $X$ to be the
gradient field of a function with local minima at each vertex, saddle points
on the edges, and local maxima at the centre of each face (see figure 4.14).
Each vertex then contributes $+1$, each edge $-1$, and each face $+1$, so this
particular field $X$ has

$$\sum_{{\rm zeros}\ n} i(z_n) = V - E + F = \chi(M). \tag{4.147}$$

In the case of a two-dimensional oriented surface equipped with a smooth
metric, it is also simple to demonstrate the invariance of the index sum.
Consider two vector fields $X$ and $Y$. Triangulate $M$ so that all zeros of
both fields lie in the interior of the faces of the simplices. The metric
allows us to compute the angle $\theta$ between $X$ and $Y$ wherever they are
both non-zero, and in particular on the edges of the simplices. For each
two-simplex $\sigma$ we compute the total change $\Delta\theta$ in the angle
as we circumnavigate its boundary. This change is an integral multiple of
$2\pi$, with the integer counting the difference

$$\sum_{{\rm zeros\ of}\ X \in \sigma} i(z_n) - \sum_{{\rm zeros\ of}\ Y \in \sigma} i(z_n) \tag{4.148}$$

of the indices of the zeros within $\sigma$. On summing over all triangles
$\sigma$, each edge is traversed twice, once in each direction, so
$\sum_\sigma \Delta\theta$ vanishes. The total index of $X$ is therefore the
same as that of $Y$.

This pairwise cancellation argument can be extended to non-orientable
surfaces, such as the projective plane. In this case the edges constituting
the homological "boundary" of the closed surface are traversed twice in the
same direction, but the angle $\theta$ at a point on one edge is paired with
$-\theta$ at the corresponding point of the other edge.

Supersymmetric Quantum Mechanics 

Edward Witten gave a beautiful proof of the Morse index theorem for an 
orientable manifold by re-interpreting the Laplace-Beltrami operator as the 
Hamiltonian of supersymmetric quantum mechanics on M. Witten's idea had 
a profound impact, and led to quantum physics serving as a rich source of 
inspiration and insight for mathematicians. We have seen most of the
ingredients of this re-interpretation in previous chapters. Indeed you should
have experienced a sense of déjà vu when you saw $d$ and $\delta$ mapping
eigenfunctions of one differential operator into eigenfunctions of a related
operator.

We begin with a novel way to think of the calculus of differential forms.
We introduce a set of fermion annihilation and creation operators $\psi^\mu$
and $\psi^{\dagger\mu}$ which anti-commute,
$\psi^\mu\psi^\nu = -\psi^\nu\psi^\mu$, and obey

$$\{\psi^\mu, \psi^{\dagger\nu}\} \equiv \psi^\mu\psi^{\dagger\nu} + \psi^{\dagger\nu}\psi^\mu = g^{\mu\nu}. \tag{4.149}$$

Here $\mu$ runs from 1 to $D$. As is usual when we are given such operators,
we also introduce a vacuum state $|0\rangle$ which is killed by all the
annihilation operators: $\psi^\mu|0\rangle = 0$. The states

$$(\psi^{\dagger 1})^{p_1}(\psi^{\dagger 2})^{p_2}\cdots(\psi^{\dagger D})^{p_D}|0\rangle, \tag{4.150}$$

with each of the $p_i$ taking the value one or zero, then constitute a basis
for a $2^D$-dimensional space. We call $p = \sum_i p_i$ the fermion number of
the state. We now assume that $\langle 0|0\rangle = 1$ and use the
anti-commutation relations to show that

$$\langle 0|\psi^{\mu_p}\cdots\psi^{\mu_2}\psi^{\mu_1}\,\psi^{\dagger\nu_1}\psi^{\dagger\nu_2}\cdots\psi^{\dagger\nu_q}|0\rangle$$

is zero unless $p = q$, in which case it is equal to

$$g^{\mu_1\nu_1}g^{\mu_2\nu_2}\cdots g^{\mu_p\nu_p} \pm ({\rm permutations}).$$
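The algebra of these operators can be checked by brute force in a small example. The sketch below assumes a flat metric $g^{\mu\nu} = \delta^{\mu\nu}$ and $D = 3$, stores a state as a dictionary from occupation bitmasks to amplitudes, and orders the creation operators as in (4.150); the function names are illustrative choices. The sign picked up by each operator counts the occupied modes it has to anti-commute past.

```python
from itertools import product

D = 3   # number of fermion modes (an illustrative choice)

def parity_below(mask, mu):
    """(-1)^(number of occupied modes with label smaller than mu)."""
    return -1 if bin(mask & ((1 << mu) - 1)).count("1") % 2 else 1

def create(mu, state):
    """Apply the creation operator for mode mu to {bitmask: amplitude}."""
    out = {}
    for mask, amp in state.items():
        if mask & (1 << mu):
            continue                          # mode already occupied
        new = mask | (1 << mu)
        out[new] = out.get(new, 0) + parity_below(mask, mu) * amp
    return out

def annihilate(mu, state):
    """Apply the annihilation operator for mode mu."""
    out = {}
    for mask, amp in state.items():
        if not mask & (1 << mu):
            continue                          # nothing to remove
        new = mask & ~(1 << mu)
        out[new] = out.get(new, 0) + parity_below(mask, mu) * amp
    return out

def add(a, b):
    """Sum of two states, dropping zero amplitudes."""
    out = dict(a)
    for mask, amp in b.items():
        out[mask] = out.get(mask, 0) + amp
    return {m: c for m, c in out.items() if c != 0}

def relations_hold():
    for mask in range(2 ** D):
        basis = {mask: 1}
        for mu, nu in product(range(D), repeat=2):
            anti = add(annihilate(mu, create(nu, basis)),
                       create(nu, annihilate(mu, basis)))
            if anti != (basis if mu == nu else {}):
                return False
            if add(create(mu, create(nu, basis)),
                   create(nu, create(mu, basis))) != {}:
                return False
    return True

fermion_numbers = sorted(bin(mask).count("1") for mask in range(2 ** D))
print(relations_hold(), fermion_numbers)
```

The list of fermion numbers shows the $2^D$ basis states grouped by $p$, with multiplicities given by the binomial coefficients, just as for $p$-forms in $D$ dimensions.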






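We now make the correspondence

$$\frac{1}{p!} f_{\mu_1\mu_2\ldots\mu_p}(x)\,\psi^{\dagger\mu_1}\psi^{\dagger\mu_2}\cdots\psi^{\dagger\mu_p}|0\rangle \leftrightarrow \frac{1}{p!} f_{\mu_1\mu_2\ldots\mu_p}(x)\,dx^{\mu_1}dx^{\mu_2}\cdots dx^{\mu_p}, \tag{4.151}$$

to identify $p$-fermion states with $p$-forms. We think of
$f_{\mu_1\mu_2\ldots\mu_p}(x)$ as being the wavefunction of a particle moving
on $M$, with the subscripts informing us there are fermions occupying the
states $\mu_i$. It is then natural to take the inner product of

$$|a\rangle = \frac{1}{p!} a_{\mu_1\ldots\mu_p}(x)\,\psi^{\dagger\mu_1}\psi^{\dagger\mu_2}\cdots\psi^{\dagger\mu_p}|0\rangle \tag{4.152}$$

and

$$|b\rangle = \frac{1}{q!} b_{\mu_1\ldots\mu_q}(x)\,\psi^{\dagger\mu_1}\psi^{\dagger\mu_2}\cdots\psi^{\dagger\mu_q}|0\rangle \tag{4.153}$$

to be

$$\langle a|b\rangle = \int_M d^Dx\,\sqrt{g}\,\frac{1}{p!\,q!}\,a^*_{\mu_1\ldots\mu_p} b_{\nu_1\ldots\nu_q}\langle 0|\psi^{\mu_p}\cdots\psi^{\mu_1}\psi^{\dagger\nu_1}\cdots\psi^{\dagger\nu_q}|0\rangle$$

$$= \delta_{pq}\int_M d^Dx\,\sqrt{g}\,\frac{1}{p!}\,a^*_{\mu_1\ldots\mu_p} b^{\mu_1\ldots\mu_p}. \tag{4.154}$$

This coincides with the Hodge inner product of the corresponding forms.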

If we lower the index by setting $\psi_\mu$ to be $g_{\mu\nu}\psi^\nu$ then
the action of $X^\mu\psi_\mu$ on a $p$-fermion state coincides with the action
of the interior multiplication $i_X$ on the corresponding $p$-form. All the
other operations of the exterior calculus can also be expressed in terms of
the $\psi$'s. In particular, in Cartesian co-ordinates where
$g_{\mu\nu} = \delta_{\mu\nu}$, we can identify $d$ with
$\psi^{\dagger\mu}\partial_\mu$. To find the operator that corresponds to the
Hodge $\delta$, we compute

$$\delta = d^\dagger = (\psi^{\dagger\mu}\partial_\mu)^\dagger = \partial_\mu^\dagger\psi^\mu = -\partial_\mu\psi^\mu = -\psi^\mu\partial_\mu. \tag{4.155}$$

The hermitian adjoint of $\partial_\mu$ is here being taken with respect to
the standard $L^2(\mathbb{R}^D)$ inner product. This computation becomes more
complicated when $g_{\mu\nu}$ becomes position dependent. The adjoint
$\partial_\mu^\dagger$ then involves the derivative of $\sqrt{g}$, and $\psi$
and $\partial_\mu$ no longer commute. For this reason, and because such
complications are inessential for what follows, we will delay discussing this
general case until the end of this section.

Having found a simple formula for $\delta$, it is now automatic to compute

$$d\delta + \delta d = -\{\psi^{\dagger\mu}, \psi^\nu\}\,\partial_\mu\partial_\nu = -\delta^{\mu\nu}\partial_\mu\partial_\nu = -\nabla^2. \tag{4.156}$$

This is much easier than deriving the same result by using
$\delta = (-1)^{Dp+D+1}\star d\star$.

Witten's fermionic formalism simplifies a number of computations involving
$\delta$, but his real innovation was to consider a deformation of the
exterior calculus by introducing the operators

$$d_t = e^{-tV(x)}\,d\,e^{tV(x)}, \qquad \delta_t = e^{tV(x)}\,\delta\,e^{-tV(x)}, \tag{4.157}$$

and

$$\Delta_t = d_t\delta_t + \delta_t d_t. \tag{4.158}$$

Here $V(x)$ is the Morse function whose stationary points we are seeking to
count.

The deformed derivative continues to obey $d_t^2 = 0$, and $d\omega = 0$ if
and only if $d_t e^{-tV}\omega = 0$. Similarly, if $\omega = d\eta$ then
$e^{-tV}\omega = d_t e^{-tV}\eta$. The cohomologies of $d$ and $d_t$ are
therefore transformed into each other by multiplication by $e^{-tV}$. Since
the exponential function is never zero, this correspondence is invertible and
the mapping is an isomorphism. In particular, the Betti numbers $b_p$, the
dimensions of ${\rm Ker}\,(d_t)_p/{\rm Im}\,(d_t)_{p-1}$, are $t$ independent.
Further, the $t$-deformed Laplace-Beltrami operator remains Fredholm with only
positive or zero eigenvalues. We can make a Hodge decomposition

$$\omega = d_t\alpha + \delta_t\beta + \gamma, \tag{4.159}$$

where $\Delta_t\gamma = 0$, and conclude that

$$\dim\left({\rm Ker}\,(\Delta_t)_p\right) = b_p \tag{4.160}$$

as before. The non-zero eigenvalue spaces will also continue to form exact
sequences. Nothing seems to have changed! Why do we introduce $d_t$ then?
The motivation is that when $t$ becomes large we can use our knowledge of
quantum mechanics to compute the Morse index.
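To do this, we expand out

$$d_t = \psi^{\dagger\mu}(\partial_\mu + t\,\partial_\mu V),$$

$$\delta_t = -\psi^\mu(\partial_\mu - t\,\partial_\mu V), \tag{4.161}$$

and find

$$d_t\delta_t + \delta_t d_t = -\nabla^2 + t^2|\nabla V|^2 + t\,[\psi^{\dagger\mu}, \psi^\nu]\,\partial^2_{\mu\nu}V. \tag{4.162}$$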

This can be thought of as a Schrödinger Hamiltonian on $M$ containing a
potential and a fermionic term. When $t$ is large and positive the potential
$t^2|\nabla V|^2$ will be large everywhere except near those points where
$\nabla V = 0$. The wavefunctions of all low-energy states, and in particular
all zero-energy states, will therefore be concentrated at precisely the
stationary points we are investigating. Let us focus on a particular
stationary point, which we will take as the origin of our co-ordinate system,
and identify any zero-energy state localized there. We first rotate the
co-ordinate system about the origin so that the Hessian matrix
$\partial^2_{\mu\nu}V|_0$ becomes diagonal with eigenvalues $\lambda_i$.
The Schrödinger problem can then be approximated by a sum of harmonic
oscillator Hamiltonians

$$\Delta_t \approx \sum_{i=1}^{D}\left\{-\frac{\partial^2}{\partial x_i^2} + t^2\lambda_i^2 x_i^2 + t\lambda_i\,[\psi^{\dagger i}, \psi^i]\right\}. \tag{4.163}$$

The commutator $[\psi^{\dagger i}, \psi^i]$ takes the value $+1$ if the $i$'th
fermion state is occupied, and $-1$ if it is not. The spectrum of the
approximate Hamiltonian is therefore

$$t\sum_{i=1}^{D}\left\{|\lambda_i|(1 + 2n_i) \pm \lambda_i\right\}. \tag{4.164}$$

Here the $n_i$ label the harmonic oscillator states. The lowest energy states
will have all the $n_i = 0$. To get a state with zero energy we must arrange
for the $\pm$ sign to be negative (no fermion in state $i$) whenever
$\lambda_i$ is positive, and to be positive (fermion state $i$ occupied)
whenever $\lambda_i$ is negative. The fermion number "$p$" of the zero-energy
state is therefore equal to the number of negative $\lambda_i$, i.e. to the
index of the critical point! We can, in this manner, find one zero-energy
state for each critical point. All other states have energies proportional to
$t$, and therefore large. Since the number of zero-energy states having
fermion number $p$ is the Betti number $b_p$, the harmonic oscillator
approximation suggests that $b_p = N_p$.
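The counting in (4.164) can be tabulated directly. The sketch below enumerates the low-lying levels $t\sum_i\{|\lambda_i|(1+2n_i) \pm \lambda_i\}$ for a hypothetical set of Hessian eigenvalues (the numbers are made up for illustration) and confirms that the unique zero-energy state has fermion number equal to the number of negative eigenvalues.

```python
from itertools import product

def low_lying_levels(lams, t=1.0, n_max=1):
    """Levels t * sum_i(|lam_i| (1 + 2 n_i) + s_i lam_i), with s_i = +1 when
    fermion state i is occupied and s_i = -1 when it is empty."""
    levels = []
    D = len(lams)
    for ns in product(range(n_max + 1), repeat=D):
        for occ in product((0, 1), repeat=D):
            e = t * sum(abs(l) * (1 + 2 * n) + (1 if o else -1) * l
                        for l, n, o in zip(lams, ns, occ))
            levels.append((e, ns, occ))
    return sorted(levels)

lams = (2.0, -1.0, 0.5)          # hypothetical Hessian eigenvalues, one negative
levels = low_lying_levels(lams, t=10.0)

ground_energy, ground_n, ground_occ = levels[0]
fermion_number = sum(ground_occ)
morse_index = sum(1 for l in lams if l < 0)
gap = levels[1][0]               # first excited level, proportional to t
print(ground_energy, ground_occ, fermion_number, morse_index, gap)
```

Only the occupation pattern that fills exactly the downward-bending directions reaches zero energy; every other level is lifted by an amount of order $t$.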

If we could trust our computation of the energy spectrum, we would have
established the Morse theorem

$$\sum_{p=0}^{D} (-1)^p N_p = \sum_{p=0}^{D} (-1)^p b_p = \chi(M), \tag{4.165}$$

by having the two sums agree term by term. Our computation is only
approximate, however. While there can be no more zero-energy states than
those we have found, some states that appear to be zero modes may instead
have small positive energy. This might arise from tunnelling between the
different potential minima, or from the higher-order corrections to the
harmonic oscillator potentials, both effects we have neglected. We can
therefore only be confident that

$$N_p \ge b_p. \tag{4.166}$$

The remarkable thing is that, for the Morse index, this does not matter! If
one of our putative zero modes gains a small positive energy, it is now in
the non-zero eigenvalue sector of the spectrum. The exact-sequence property
therefore tells us that one of the other putative zero modes must also be a
not-quite-zero mode state with exactly the same energy. This second state
will have a fermion number that differs from the first by plus or minus one.
Our error in counting the zero-energy states therefore cancels out when we
take the alternating sum. Our unreliable estimate $b_p \approx N_p$ has thus
provided us with an exact computation of the Morse index.

We have described Witten's argument as if the manifold M were flat. 
When the manifold M is not flat, however, the curvature will not affect 
our computations. Once the parameter t is large the low-energy eigenfunc- 
tions will be so tightly localized about the critical points that they will be 
hard-pressed to detect the curvature. Even if the curvature can effect an
infinitesimal energy shift, the exact-sequence argument again shows that this
does not affect the alternating sum.

The Weitzenbock Formula 

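Although we were able to evade them when proving the Morse index
theorem, it is interesting to uncover the workings of the nitty-gritty
Riemann tensor index machinery that lie concealed behind the polished facade
of Hodge's $d$, $\delta$ calculus.

Let us assume that our manifold $M$ is equipped with a torsion-free
connection $\Gamma^\mu{}_{\nu\lambda} = \Gamma^\mu{}_{\lambda\nu}$, and use
this connection to define the action of an operator $\hat\nabla_\mu$ by
specifying its commutators with c-number functions $f$, and with the
$\psi^\mu$ and $\psi^{\dagger\mu}$'s:

$$[\hat\nabla_\mu, f] = \partial_\mu f,$$

$$[\hat\nabla_\mu, \psi^{\dagger\nu}] = -\Gamma^\nu{}_{\mu\lambda}\psi^{\dagger\lambda},$$

$$[\hat\nabla_\mu, \psi^{\nu}] = -\Gamma^\nu{}_{\mu\lambda}\psi^{\lambda}. \tag{4.167}$$

We also set $\hat\nabla_\mu|0\rangle = 0$. These rules allow us to compute the
action of $\hat\nabla_\mu$ on
$f_{\mu_1\mu_2\ldots\mu_p}(x)\,\psi^{\dagger\mu_1}\cdots\psi^{\dagger\mu_p}|0\rangle$.
For example

$$\hat\nabla_\mu\left(f_\nu\psi^{\dagger\nu}|0\rangle\right) = \left([\hat\nabla_\mu, f_\nu\psi^{\dagger\nu}] + f_\nu\psi^{\dagger\nu}\hat\nabla_\mu\right)|0\rangle$$

$$= \left([\hat\nabla_\mu, f_\nu]\psi^{\dagger\nu} + f_\alpha[\hat\nabla_\mu, \psi^{\dagger\alpha}]\right)|0\rangle$$

$$= \left(\partial_\mu f_\nu - f_\alpha\Gamma^\alpha{}_{\mu\nu}\right)\psi^{\dagger\nu}|0\rangle$$

$$= (\nabla_\mu f_\nu)\,\psi^{\dagger\nu}|0\rangle, \tag{4.168}$$

where

$$\nabla_\mu f_\nu = \partial_\mu f_\nu - \Gamma^\alpha{}_{\mu\nu} f_\alpha, \tag{4.169}$$

is the usual covariant derivative acting on the components of a covariant
vector.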

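The metric $g^{\mu\nu}$ counts as a c-number function, and so
$[\hat\nabla_\alpha, g^{\mu\nu}]$ is not zero, but is instead
$\partial_\alpha g^{\mu\nu}$. This might be disturbing (being able to pass the
metric through a covariant derivative is a basic compatibility condition in
Riemann geometry) but all is not lost. $\hat\nabla_\mu$ (with a caret) is not
quite the same beast as $\nabla_\mu$. We proceed as follows:

$$\partial_\alpha g^{\mu\nu} = [\hat\nabla_\alpha, g^{\mu\nu}]$$

$$= [\hat\nabla_\alpha, \psi^{\dagger\mu}\psi^\nu] + [\hat\nabla_\alpha, \psi^\nu\psi^{\dagger\mu}]$$

$$= -g^{\lambda\nu}\Gamma^\mu{}_{\alpha\lambda} - g^{\mu\lambda}\Gamma^\nu{}_{\alpha\lambda}. \tag{4.170}$$

We conclude that

$$\partial_\alpha g^{\mu\nu} + g^{\lambda\nu}\Gamma^\mu{}_{\alpha\lambda} + g^{\mu\lambda}\Gamma^\nu{}_{\alpha\lambda} \equiv \nabla_\alpha g^{\mu\nu} = 0. \tag{4.171}$$

Metric compatibility is therefore satisfied, and the connection is therefore
the standard Riemannian

$$\Gamma^\alpha{}_{\mu\nu} = \frac{1}{2} g^{\alpha\lambda}\left(\partial_\mu g_{\lambda\nu} + \partial_\nu g_{\mu\lambda} - \partial_\lambda g_{\mu\nu}\right). \tag{4.172}$$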

Knowing this, we can compute the adjoint of $\hat\nabla_\mu$:

$$(\hat\nabla_\mu)^\dagger = -(\hat\nabla_\mu + \partial_\mu\ln\sqrt{g}) = -(\hat\nabla_\mu + \Gamma^\nu{}_{\nu\mu}). \tag{4.173}$$

That $\Gamma^\nu{}_{\nu\mu}$ is the logarithmic derivative of $\sqrt{g}$ is a
standard identity for the Riemann connection (see exercise 2.14). The
resultant formula for $(\hat\nabla_\mu)^\dagger$ can be used to verify that the
second and third equations in (4.167) are compatible with each other.

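We can also compute $[[\hat\nabla_\mu, \hat\nabla_\nu], \psi^\alpha]$ and from
it deduce that

$$[\hat\nabla_\mu, \hat\nabla_\nu] = R_{\sigma\lambda\mu\nu}\,\psi^{\dagger\sigma}\psi^\lambda, \tag{4.174}$$

where

$$R^\alpha{}_{\beta\mu\nu} = \partial_\mu\Gamma^\alpha{}_{\beta\nu} - \partial_\nu\Gamma^\alpha{}_{\beta\mu} + \Gamma^\alpha{}_{\lambda\mu}\Gamma^\lambda{}_{\beta\nu} - \Gamma^\alpha{}_{\lambda\nu}\Gamma^\lambda{}_{\beta\mu} \tag{4.175}$$

is the Riemann curvature tensor.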
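We now define $d$ to be

$$d = \psi^{\dagger\mu}\hat\nabla_\mu. \tag{4.176}$$

Its action coincides with the usual $d$ because the symmetry of the
$\Gamma^\alpha{}_{\mu\nu}$'s ensures that their contributions cancel. From
this we find that $\delta$ is

$$\delta = (\psi^{\dagger\mu}\hat\nabla_\mu)^\dagger$$

$$= \hat\nabla_\mu^\dagger\psi^\mu$$

$$= -(\hat\nabla_\mu + \Gamma^\nu{}_{\nu\mu})\psi^\mu$$

$$= -\psi^\mu(\hat\nabla_\mu + \Gamma^\nu{}_{\nu\mu}) + \Gamma^\mu{}_{\mu\nu}\psi^\nu$$

$$= -\psi^\mu\hat\nabla_\mu. \tag{4.177}$$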
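The Laplace-Beltrami operator can now be worked out as

$$d\delta + \delta d = -\left(\psi^{\dagger\mu}\hat\nabla_\mu\psi^\nu\hat\nabla_\nu + \psi^\nu\hat\nabla_\nu\psi^{\dagger\mu}\hat\nabla_\mu\right)$$

$$= -\left(\{\psi^{\dagger\mu}, \psi^\nu\}(\hat\nabla_\mu\hat\nabla_\nu - \Gamma^\sigma{}_{\mu\nu}\hat\nabla_\sigma) + \psi^\nu\psi^{\dagger\mu}[\hat\nabla_\nu, \hat\nabla_\mu]\right)$$

$$= -\left(g^{\mu\nu}(\hat\nabla_\mu\hat\nabla_\nu - \Gamma^\sigma{}_{\mu\nu}\hat\nabla_\sigma) + \psi^\nu\psi^{\dagger\mu}\psi^{\dagger\sigma}\psi^\lambda R_{\sigma\lambda\nu\mu}\right). \tag{4.178}$$

By making use of the symmetries $R_{\sigma\lambda\nu\mu} = R_{\nu\mu\sigma\lambda}$
and $R_{\sigma\lambda\nu\mu} = -R_{\sigma\lambda\mu\nu}$ we can tidy up the
curvature term to get

$$d\delta + \delta d = -g^{\mu\nu}(\hat\nabla_\mu\hat\nabla_\nu - \Gamma^\sigma{}_{\mu\nu}\hat\nabla_\sigma) - \psi^{\dagger\alpha}\psi^\beta\psi^{\dagger\mu}\psi^\nu R_{\alpha\beta\mu\nu}. \tag{4.179}$$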

This result is called the Weitzenböck formula. An equivalent formula can be
derived directly from (4.124), but only with a great deal more effort. The
part without the curvature tensor is called the Bochner Laplacian. It is
normally written as $B = -g^{\mu\nu}\nabla_\mu\nabla_\nu$ with $\nabla_\mu$
being understood to be acting on the index $\nu$, and therefore tacitly
containing the extra $\Gamma^\sigma{}_{\mu\nu}$ that must be made explicit
when we define the action of $\hat\nabla_\mu$ via commutators. The Bochner
Laplacian can also be written as

$$B = \hat\nabla_\mu^\dagger\, g^{\mu\nu}\hat\nabla_\nu \tag{4.180}$$

which shows that it is a positive operator.

Chapter 5

Groups and Group Representations

Groups appear in physics as symmetries of the system we are studying. Often 
the symmetry operation involves a linear transformation, and this naturally 
leads to the idea of finding sets of matrices having the same multiplication 
table as the group. These sets are called representations of the group. Given 
a group, we endeavour to find and classify all possible representations. 

5.1 Basic Ideas 

We begin with a rapid review of basic group theory.

5.1.1 Group Axioms

A group $G$ is a set with a binary operation that assigns to each ordered pair
$(g_1, g_2)$ of elements a third element, $g_3$, usually written with
multiplicative notation as $g_3 = g_1g_2$. The binary operation, or product,
obeys the following rules:

i) Associativity: $g_1(g_2g_3) = (g_1g_2)g_3$.

ii) Existence of an identity: There is an element$^1$ $e \in G$ such that
$eg = g$ for all $g \in G$.

iii) Existence of an inverse: For each $g \in G$ there is an element $g^{-1}$
such that $g^{-1}g = e$.

$^1$ The symbol "e" is often used for the identity element, from the German
Einheit, meaning "unity."

From these axioms there follow some conclusions that are so basic that
they are often included in the axioms themselves, but since they are not
independent, we state them as corollaries.

Corollary i): $gg^{-1} = e$.

Proof: Start from $g^{-1}g = e$, and multiply on the right by $g^{-1}$ to get
$g^{-1}gg^{-1} = eg^{-1} = g^{-1}$, where we have used the left identity
property of $e$ at the last step. Now multiply on the left by
$(g^{-1})^{-1}$, and use associativity to get $gg^{-1} = e$.

Corollary ii): $ge = g$.

Proof: Write $ge = g(g^{-1}g) = (gg^{-1})g = eg = g$.

Corollary iii): The identity $e$ is unique.

Proof: Suppose there is another element $e_1$ such that $e_1g = eg = g$.
Multiply on the right by $g^{-1}$ to get $e_1e = e^2 = e$, but $e_1e = e_1$,
so $e_1 = e$.

Corollary iv): The inverse of a given element $g$ is unique.

Proof: Let $g_1g = g_2g = e$. Use the result of corollary (i), that any left
inverse is also a right inverse, to multiply on the right by $g^{-1}$, and so
find that $g_1 = g_2$.

Two elements $g_1$ and $g_2$ are said to commute if $g_1g_2 = g_2g_1$. If the
group has the property that $g_1g_2 = g_2g_1$ for all $g_1, g_2 \in G$, it is
said to be Abelian, otherwise it is non-Abelian.

If the set G contains only finitely many elements, the group G is said to 
be finite. The number of elements in the group, |G|, is called the order of 
the group. 

Examples of Groups:

1) The integers $\mathbb{Z}$ under addition. The binary operation is
$(n, m) \mapsto n + m$, and "0" plays the role of the identity element.
This is not a finite group.

2) The integers modulo $n$ under addition: $(m, m') \mapsto m + m'$, mod $n$.
This group is denoted by $\mathbb{Z}_n$.

3) The non-zero integers modulo $p$ (a prime) under multiplication:
$(m, m') \mapsto mm'$, mod $p$. Here "1" is the identity element. If the
modulus is not a prime number, we do not get a group (why not?). This
group is sometimes denoted by $(\mathbb{Z}_p)^\times$.

4) The set of numbers {2, 4, 6, 8} under multiplication modulo 10. Here, the
number "6" plays the role of the identity!

5) The set of functions

$$f_1(z) = z, \quad f_2(z) = \frac{1}{1-z}, \quad f_3(z) = \frac{z-1}{z},$$

$$f_4(z) = \frac{1}{z}, \quad f_5(z) = 1 - z, \quad f_6(z) = \frac{z}{z-1},$$

with $(f_i, f_j) \mapsto f_i \circ f_j$. Here the "$\circ$" is a standard
notation for composition of functions: $(f_i \circ f_j)(z) = f_i(f_j(z))$.

6) The set of rotations in three dimensions, equivalently the set of 3-by-3
real matrices $O$ obeying $O^TO = I$ and $\det O = 1$. This is the group
SO(3). SO($n$) is defined analogously as the group of rotations in $n$
dimensions. If we relax the condition on the determinant we get the
orthogonal group O($n$). Both SO($n$) and O($n$) are examples of Lie
groups. A Lie group is a group that is also a manifold $M$, and whose
multiplication law is a smooth function $M \times M \to M$.

7) Groups are often specified by giving a list of generators and relations.
For example the cyclic group of order $n$, denoted by $C_n$, is specified by
giving the generator $a$ and relation $a^n = e$. Similarly, the dihedral
group $D_n$ has two generators $a$, $b$ and relations $a^n = e$, $b^2 = e$,
$(ab)^2 = e$. This group has order $2n$.
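For small finite examples such as 3) and 4), the axioms can be verified by exhaustive search. The sketch below is a brute-force checker (the name `is_group` and its return convention are illustrative choices) that tests closure, associativity, the existence of a left identity and the existence of left inverses, exactly as in the axioms of section 5.1.1.

```python
from itertools import product

def is_group(elements, op):
    """Return (True, identity) if the axioms hold, otherwise (False, None)."""
    elements = list(elements)
    # Closure: the product of any ordered pair must lie in the set.
    if any(op(a, b) not in elements for a, b in product(elements, repeat=2)):
        return False, None
    # Associativity.
    if any(op(a, op(b, c)) != op(op(a, b), c)
           for a, b, c in product(elements, repeat=3)):
        return False, None
    # Left identity.
    identities = [e for e in elements if all(op(e, g) == g for g in elements)]
    if not identities:
        return False, None
    e = identities[0]
    # Left inverses.
    if all(any(op(h, g) == e for h in elements) for g in elements):
        return True, e
    return False, None

def mult_mod(n):
    return lambda a, b: (a * b) % n

print(is_group([2, 4, 6, 8], mult_mod(10)))     # example 4
print(is_group(range(1, 7), mult_mod(7)))       # non-zero integers mod 7
print(is_group(range(1, 6), mult_mod(6)))       # non-prime modulus
print(is_group(range(5), lambda a, b: (a + b) % 5))
```

The search is cubic in the number of elements, so it is only suitable for very small sets.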

5.1.2 Elementary Properties 

Here are the basic properties of groups that we need: 

i) Subgroups: If a subset of elements of a group forms a group, it is
called a subgroup. For example, $\mathbb{Z}_{12}$ has a subgroup consisting
of {0, 3, 6, 9}. All groups have at least two subgroups: the trivial
subgroups $G$ itself, and $\{e\}$. Any other subgroups are called proper
subgroups.

ii) Cosets: Given a subgroup $H \subseteq G$, having elements
$\{h_1, h_2, \ldots\}$, and an element $g \in G$, we form the (left) coset
$gH = \{gh_1, gh_2, \ldots\}$. If two cosets $g_1H$ and $g_2H$ intersect,
they coincide. (Proof: if $g_1h_1 = g_2h_2$, then $g_2 = g_1(h_1h_2^{-1})$
and so $g_1H = g_2H$.) If $H$ is a finite group, each coset has the same
number of distinct elements as $H$. (Proof: if $gh_1 = gh_2$ then left
multiplication by $g^{-1}$ shows that $h_1 = h_2$.) If the order of $G$ is
also finite, the group $G$ is decomposed into an integer number of cosets,

$$G = g_1H + g_2H + \cdots, \tag{5.1}$$

where "+" denotes the union of disjoint sets. From this we see that the
order of $H$ must divide the order of $G$. This result is called Lagrange's
theorem. The set whose elements are the cosets is denoted by $G/H$.

iii) Normal subgroups and quotient groups: A subgroup $H$ of $G$ is said
to be normal, or invariant, if $g^{-1}Hg = H$ for all $g \in G$. Given a
normal subgroup $H$, we can define a multiplication rule on the coset
space $G/H = \{g_1H, g_2H, \ldots\}$ by taking a representative element
from each of $g_iH$ and $g_jH$, taking the product of these elements, and
defining $(g_iH)(g_jH)$ to be the coset in which this product lies. This
coset is independent of the representative elements chosen (this would
not be so if the subgroup were not normal). The resulting group is
called the quotient group $G/H$. (Note that the symbol "$G/H$" is used
to denote both the set of cosets, and, when it exists, the group whose
elements are these cosets.)

iv) Simple groups: A group $G$ with no normal subgroups other than $G$
itself and $\{e\}$ is said to be simple. The finite simple groups have been
classified. They fall into various infinite families (Cyclic groups,
Alternating groups, 16 families of Lie type) together with 26 sporadic
groups, the largest of which, the Monster, has order
808,017,424,794,512,875,886,459,904,961,710,757,005,754,368,000,000,000.
The mysterious "Monstrous moonshine" links its representation theory to
the elliptic modular function $J(\tau)$ and to string theory.

v) Conjugacy and Conjugacy Classes: Two group elements $g_1$, $g_2$ are said
to be conjugate in $G$ if there is an element $g \in G$ such that
$g_2 = g^{-1}g_1g$. If $g_1$ is conjugate to $g_2$, we write $g_1 \sim g_2$.
Conjugacy is an equivalence relation,$^2$ and, for finite groups, the
resulting conjugacy classes have orders that divide the order of $G$. To
see this, consider the conjugacy class containing an element $g$. Observe
that the set $H$ of elements $h \in G$ such that $h^{-1}gh = g$ forms a
subgroup. The set of elements conjugate to $g$ can be identified with the
coset space $G/H$. The order of $G$ divided by the order of the conjugacy
class is therefore $|H|$.

$^2$ An equivalence relation, $\sim$, is a binary relation that is

i) Reflexive: $A \sim A$.

ii) Symmetric: $A \sim B \iff B \sim A$.

iii) Transitive: $A \sim B$, $B \sim C \implies A \sim C$.

Such a relation breaks a set up into disjoint equivalence classes.
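The coset and conjugacy-class counting above can be checked on small groups. The sketch below is an illustration under stated assumptions: it uses the subgroup {0, 3, 6, 9} of $\mathbb{Z}_{12}$ from item i), and, for conjugacy, the permutation group on three objects stored as tuples (a choice made here for convenience). It verifies that the cosets partition the group into equal-sized pieces and that the size of each conjugacy class times the order of the subgroup $H$ that fixes a representative equals the order of the group.

```python
from itertools import permutations

# Left cosets gH of H = {0, 3, 6, 9} in Z_12 (the product is addition mod 12).
H = (0, 3, 6, 9)
cosets = {frozenset((g + h) % 12 for h in H) for g in range(12)}

def compose(p, q):
    """Product pq of permutations stored as tuples: apply q first, then p."""
    return tuple(p[q[i]] for i in range(len(q)))

def inverse(p):
    inv = [0] * len(p)
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conjugate(x, g):
    """Return g^{-1} x g."""
    return compose(inverse(g), compose(x, g))

S3 = list(permutations(range(3)))
classes = {frozenset(conjugate(x, g) for g in S3) for x in S3}

# For one representative x of each class, count the h with h^{-1} x h = x.
stabilizer_order = {}
for cls in classes:
    x = min(cls)
    stabilizer_order[cls] = sum(1 for h in S3 if conjugate(x, h) == x)

print(sorted(sorted(c) for c in cosets))
print(sorted((len(c), stabilizer_order[c]) for c in classes))
```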
Example: In the rotation group SO(3), the conjugacy classes are the sets of
rotations through the same angle, but about different axes.

Example: In the group U($n$), of $n$-by-$n$ unitary matrices, the conjugacy
classes are the sets of matrices possessing the same eigenvalues.

Example: Permutations. The permutation group on $n$ objects, $S_n$, has order
$n!$. Suppose we consider permutations $\pi_1$, $\pi_2$ in $S_8$ such that
$\pi_1$ maps

$$\pi_1 : \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\ 2 & 3 & 1 & 5 & 4 & 7 & 6 & 8 \end{pmatrix}$$

and $\pi_2$ maps

$$\pi_2 : \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\ 2 & 3 & 4 & 5 & 6 & 7 & 8 & 1 \end{pmatrix}.$$

The product $\pi_2 \circ \pi_1$ then takes

$$\pi_2 \circ \pi_1 : \begin{pmatrix} 1 & 2 & 3 & 4 & 5 & 6 & 7 & 8 \\ \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow & \downarrow \\ 3 & 4 & 2 & 6 & 5 & 8 & 7 & 1 \end{pmatrix}.$$

We can write these permutations out more compactly by using Paolo Ruffini's
cycle notation:

$$\pi_1 = (123)(45)(67)(8), \quad \pi_2 = (12345678), \quad \pi_2 \circ \pi_1 = (132468)(5)(7).$$

In this notation, each number is mapped to the one immediately to its right,
with the last number in each bracket, or cycle, wrapping round to map to
the first. Thus $\pi_1(1) = 2$, $\pi_1(2) = 3$, $\pi_1(3) = 1$. The "8", being
both first and last in its cycle, maps to itself: $\pi_1(8) = 8$. Any
permutation with this cycle pattern, (* * *)(* *)(* *)(*), is in the same
conjugacy class as $\pi_1$. We say that $\pi_1$ possesses one 1-cycle, two
2-cycles, and one 3-cycle. The class $(r_1, r_2, \ldots r_n)$ having $r_1$
1-cycles, $r_2$ 2-cycles etc., where $r_1 + 2r_2 + \cdots + nr_n = n$,
contains

$$N_{(r_1, r_2, \ldots)} = \frac{n!}{1^{r_1}(r_1!)\,2^{r_2}(r_2!)\cdots n^{r_n}(r_n!)}$$

elements. The sign of the permutation,

$${\rm sgn}\,\pi = \epsilon_{\pi(1)\pi(2)\pi(3)\ldots\pi(n)},$$

is equal to

$${\rm sgn}\,\pi = (+1)^{r_1}(-1)^{r_2}(+1)^{r_3}(-1)^{r_4}\cdots.$$

We have, for any two permutations $\pi_1$, $\pi_2$,

$${\rm sgn}\,(\pi_1)\,{\rm sgn}\,(\pi_2) = {\rm sgn}\,(\pi_1 \circ \pi_2),$$

so the even (${\rm sgn}\,\pi = +1$) permutations form an invariant subgroup
called the Alternating group, $A_n$. The group $A_n$ is simple for $n \ge 5$,
and Ruffini (1801) showed that this simplicity prevents the solution of the
general quintic by radicals. His work was ignored, however, and later
independently rediscovered by Abel (1824) and Galois (1829).
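The permutation example can be reproduced mechanically. The sketch below stores $\pi_1$ and $\pi_2$ as dictionaries (a representation chosen here for illustration), composes them, extracts the cycles, evaluates the class-size formula for the cycle pattern of $\pi_1$, and computes signs from the number of even-length cycles.

```python
from math import factorial

# Permutations of {1,...,8} stored as dicts: perm[i] is the image of i.
pi1 = {1: 2, 2: 3, 3: 1, 4: 5, 5: 4, 6: 7, 7: 6, 8: 8}
pi2 = {i: i % 8 + 1 for i in range(1, 9)}        # the 8-cycle (12345678)

def compose(p, q):
    """p o q: apply q first, then p."""
    return {i: p[q[i]] for i in q}

def cycles(p):
    """Cycle decomposition, each cycle starting from its smallest unused entry."""
    seen, out = set(), []
    for start in sorted(p):
        if start in seen:
            continue
        cyc, i = [], start
        while i not in seen:
            seen.add(i)
            cyc.append(i)
            i = p[i]
        out.append(tuple(cyc))
    return out

def cycle_type(p):
    """Dictionary r with r[k] = number of k-cycles."""
    r = {}
    for c in cycles(p):
        r[len(c)] = r.get(len(c), 0) + 1
    return r

def class_size(r, n):
    """n! / prod_k (k^{r_k} r_k!) for the cycle type r."""
    denom = 1
    for k, rk in r.items():
        denom *= k ** rk * factorial(rk)
    return factorial(n) // denom

def sign(p):
    """Each cycle of even length contributes a factor of -1."""
    even_cycles = sum(rk for k, rk in cycle_type(p).items() if k % 2 == 0)
    return -1 if even_cycles % 2 else 1

product_perm = compose(pi2, pi1)
print(cycles(pi1), cycles(product_perm))
print(class_size(cycle_type(pi1), 8), sign(pi1), sign(pi2), sign(product_perm))
```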

If we write out the group elements in some order $\{e, g_1, g_2, \ldots\}$, and
then multiply on the left

$$g\{e, g_1, g_2, \ldots\} = \{g, gg_1, gg_2, \ldots\}$$

then the ordered list $\{g, gg_1, gg_2, \ldots\}$ is a permutation of the
original list. Any finite group is therefore a subgroup of $S_{|G|}$. This is
called Cayley's Theorem.

Exercise 5.1: Let $H_1$, $H_2$ be two subgroups of a group $G$. Show that
$H_1 \cap H_2$ is also a subgroup.

Exercise 5.2: Let $G$ be any group.

a) The subset $Z(G)$ of $G$ consisting of those $g \in G$ that commute with
all other elements of the group is called the center of the group. Show that
$Z(G)$ is a subgroup of $G$.

b) If $g$ is an element of $G$, the set $C_G(g)$ of elements of $G$ that
commute with $g$ is called the centralizer of $g$ in $G$. Show that it is a
subgroup of $G$.

c) If $H$ is a subgroup, the set of elements of $G$ that commute with all
elements of $H$ is the centralizer $C_G(H)$ of $H$ in $G$. Show that it is a
subgroup of $G$.

d) If $H$ is a subgroup, the set $N_G(H) \subseteq G$ consisting of those $g$
such that $g^{-1}Hg = H$ is called the normalizer of $H$ in $G$. Show that
$N_G(H)$ is a subgroup of $G$, and that $H$ is a normal subgroup of $N_G(H)$.

Exercise 5.3: Show that the set of powers $a^n$ of an element $a \in G$ forms
a subgroup. Let $p$ be prime. Recall that the set $\{1, 2, \ldots, p-1\}$ forms
the group $(\mathbb{Z}_p)^\times$ under multiplication modulo $p$. By appealing
to Lagrange's theorem, prove Fermat's little theorem that for any prime $p$
and integer $a$ not divisible by $p$, we have $a^{p-1} = 1$, mod $p$.

Exercise 5.4: Use Fermat's theorem from the previous exercise to establish
the mathematical identity underlying the RSA algorithm for public-key
cryptography: Let $p$, $q$ be prime and $N = pq$. First use Euclid's algorithm
for the HCF of two numbers to show that if the integer $e$ is co-prime$^3$ to
$(p-1)(q-1)$, then there is an integer $d$ such that

$$de = 1, \quad {\rm mod}\ (p-1)(q-1).$$

Then show that if

$$C = M^e, \quad {\rm mod}\ N, \qquad {\rm (encryption)}$$

then

$$M = C^d, \quad {\rm mod}\ N. \qquad {\rm (decryption)}$$

The numbers $e$ and $N$ can be made known to the public, but it is hard to find
the secret decoding key, $d$, unless the factors $p$ and $q$ of $N$ are known.
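A numerical instance does not replace the proof asked for in the exercise, but it shows the identity at work. The sketch below uses toy primes, far too small for real cryptography, together with a textbook extended-Euclid routine; the function names and the particular numbers are illustrative assumptions.

```python
def extended_gcd(a, b):
    """Return (g, x, y) with a*x + b*y == g == gcd(a, b)."""
    if b == 0:
        return a, 1, 0
    g, x, y = extended_gcd(b, a % b)
    return g, y, x - (a // b) * y

def make_keys(p, q, e):
    """Return the modulus N = pq and a decoding exponent d with de = 1 mod (p-1)(q-1)."""
    phi = (p - 1) * (q - 1)
    g, x, _ = extended_gcd(e, phi)
    if g != 1:
        raise ValueError("e must be co-prime to (p-1)(q-1)")
    return p * q, x % phi

p, q, e = 61, 53, 17                  # toy values chosen for illustration
N, d = make_keys(p, q, e)

message = 65
cipher = pow(message, e, N)           # encryption: C = M^e mod N
recovered = pow(cipher, d, N)         # decryption: M = C^d mod N
print(N, d, cipher, recovered)
```

The three-argument form of Python's built-in `pow` performs modular exponentiation without forming the huge intermediate power.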

Exercise 5.5: Consider the group $Q$ with multiplication table shown in table
5.1.

     Q |  I   A   B   C   D   E
    ---+------------------------
     I |  I   A   B   C   D   E
     A |  A   B   I   E   C   D
     B |  B   I   A   D   E   C
     C |  C   D   E   I   A   B
     D |  D   E   C   B   I   A
     E |  E   C   D   A   B   I

Table 5.1: Multiplication table of $Q$. To find $AB$ look in row $A$ column $B$.

This group has a proper subgroup $H = \{I, A, B\}$, and the corresponding
(left) cosets are $IH = \{I, A, B\}$ and $CH = \{C, D, E\}$.

(i) Construct the conjugacy classes of this group.

(ii) Show that $\{I, A, B\}$ and $\{C, D, E\}$ are indeed the left cosets of
$H$.

(iii) Determine whether $H$ is a normal subgroup.

(iv) If so, construct the group multiplication table for the corresponding
quotient group.

$^3$ Has no factors in common with.

Exercise 5.6: Let $H$ and $K$ be groups. Make the Cartesian product
$G = H \times K$ into a group by introducing a multiplication rule for
elements of the Cartesian product by setting:

$$(h_1, k_1) * (h_2, k_2) = (h_1h_2, k_1k_2).$$

Show that $G$, equipped with $*$ as its product, satisfies the group axioms.
The resultant group is called the direct product of $H$ and $K$.

Exercise 5.7: If $F$ and $G$ are groups, a map $\varphi : F \to G$ that
preserves the group structure, i.e. for which
$\varphi(g_1)\varphi(g_2) = \varphi(g_1g_2)$, is called a group homomorphism.
If $\varphi$ is such a homomorphism show that $\varphi(e_F) = e_G$, where
$e_F$ and $e_G$ are the identity elements in $F$, $G$ respectively.

Exercise 5.8: If $\varphi : F \to G$ is a group homomorphism, and if we define
${\rm Ker}(\varphi)$ as the set of elements $f \in F$ that map to $e_G$, show
that ${\rm Ker}(\varphi)$ is a normal subgroup of $F$.

5.1.3 Group Actions on Sets 

Groups usually appear in physics as symmetries: they act on a physical 
object to change it in some way, perhaps while leaving some other property 
invariant. 

Suppose $X$ is a set. We call its elements "points." A group action on $X$
is a map $g \in G : X \to X$ that takes a point $x \in X$ to a new point that
we denote by $gx \in X$, and such that $g_2(g_1x) = (g_2g_1)x$, and $ex = x$.
There is some standard vocabulary for group actions:

i) Given a point $x \in X$ we define the orbit of $x$ to be the set
$Gx = \{gx : g \in G\} \subseteq X$.

ii) The action of the group is transitive if any orbit is the whole of $X$.

iii) The action is effective, or faithful, if the map $g : X \to X$ being the
identity map implies that $g = e$. Another way of saying this is that
the action is effective if the map $G \to {\rm Map}\,(X \to X)$ is one-to-one.
If the action of $G$ is not faithful, the set of $g \in G$ that act as the
identity map forms an invariant subgroup $H$ of $G$, and the quotient group
$G/H$ has a faithful action.

iv) The action is free if the existence of an $x$ such that $gx = x$ implies
that $g = e$. In this case, we also say that $G$ acts without fixed points.

If the group acts freely and transitively, then having chosen a fiducial
point $x_0$, we can uniquely label every point in $X$ by the group element $g$
such that $x = gx_0$. (If $g_1$ and $g_2$ both take $x_0 \to x$, then
$g_1^{-1}g_2x_0 = x_0$. By the free action property we deduce that
$g_1^{-1}g_2 = e$, and $g_1 = g_2$.) In this case we might, for some purposes,
identify $X$ with $G$.

Suppose the group acts transitively, but not freely. Let $H$ be the set
of elements that leaves $x_0$ fixed. This is clearly a subgroup of $G$, and if
$g_1x_0 = g_2x_0$ we have $g_1^{-1}g_2 \in H$, or $g_1H = g_2H$. The space $X$
can therefore be identified with the space of cosets $G/H$. Such sets are
called quotient spaces or homogeneous spaces. Many spaces of significance in
physics can be thought of as cosets in this way.
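The identification of $X$ with $G/H$ can be seen in a finite example. The sketch below assumes the permutation group on four objects acting on the points {0, 1, 2, 3} by $g\,x = g[x]$ (an example chosen here, not taken from the text). The action is transitive but not free, and each left coset of the stabilizer of $x_0$ corresponds to exactly one point of the orbit.

```python
from itertools import permutations

G = list(permutations(range(4)))        # S_4, acting by g.x = g[x]
x0 = 0                                   # fiducial point

def compose(g1, g2):
    """Group product with (g1 g2).x = g1.(g2.x)."""
    return tuple(g1[g2[i]] for i in range(4))

orbit = {g[x0] for g in G}
H = [g for g in G if g[x0] == x0]        # elements leaving x0 fixed

# Record, for each point x = g.x0, the set of distinct cosets gH reaching it.
cosets_reaching = {}
for g in G:
    coset = frozenset(compose(g, h) for h in H)
    cosets_reaching.setdefault(g[x0], set()).add(coset)

all_cosets = set().union(*cosets_reaching.values())
print(len(G), len(H), len(orbit), len(all_cosets))
```

The counts satisfy $|G| = |H|\,|X|$, the finite-group shadow of the statement $S^2 \simeq {\rm SO}(3)/{\rm SO}(2)$ for continuous groups.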

Example: The rotation group SO(3) acts transitively on the two-sphere $S^2$.
The SO(2) subgroup of rotations about the z axis leaves the north pole of
the sphere fixed. We can therefore identify $S^2 \simeq SO(3)/SO(2)$.

Many phase transitions are a result of spontaneous symmetry breaking. 
For example, the water $\to$ ice transition results in the continuous translation
invariance of the liquid water being broken down to the discrete translation 
invariance of the crystal lattice of the solid ice. When a system with symme- 
try group G spontaneously breaks the symmetry to a subgroup H, the set 
of inequivalent ground states can be identified with the homogeneous space 
G/H. 

5.2 Representations 

An n-dimensional representation of a group is formally defined to be a
homomorphism from G to a subgroup of GL(n, C), the group of invertible n-by-n
matrices with complex entries. In effect, it is a set of n-by-n matrices that
obeys the group multiplication rules

$$ D(g_1) D(g_2) = D(g_1 g_2), \qquad D(g^{-1}) = [D(g)]^{-1}. \tag{5.2} $$

Given such a representation, we can form another one $D'(g)$ by conjugation
with any fixed invertible matrix C

$$ D'(g) = C^{-1} D(g) C. \tag{5.3} $$

If $D'(g)$ is obtained from $D(g)$ in this way, we say that they are equivalent
representations and write $D \sim D'$. We can think of D and D' as being
matrices representing the same linear map, but in different bases. Our task
in the rest of this chapter is to find and classify all representations of a finite
group G up to equivalence.

Real and pseudo-real representations 

We can form a new representation from D{g) by setting 

D'(g) = D*(g), 

where D*(g) denotes the matrix whose entries are the complex conjugates 
of those in D(g). Suppose D* ~ D. It may then be possible to find a 
basis in which the matrices have only real entries. In this case we say the 
representation is real. It may, however, be that D* ~ D but we cannot
find a basis in which the matrices become real. In this case we say that D is 
pseudo-real. 

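Example: Consider the defining representation of SU(2) (the group of 2-by-2
unitary matrices with unit determinant). Such matrices are necessarily of
the form

$$ U = \begin{pmatrix} a & -b^* \\ b & a^* \end{pmatrix}, \tag{5.4} $$

where a and b are complex numbers with $|a|^2 + |b|^2 = 1$. They are therefore
specified by three real parameters, and so the group manifold is three
dimensional. Now

$$ \begin{pmatrix} a & -b^* \\ b & a^* \end{pmatrix}^{*}
 = \begin{pmatrix} a^* & -b \\ b^* & a \end{pmatrix}
 = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}
   \begin{pmatrix} a & -b^* \\ b & a^* \end{pmatrix}
   \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}
 = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}^{-1}
   \begin{pmatrix} a & -b^* \\ b & a^* \end{pmatrix}
   \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix}, \tag{5.5} $$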



and so U ~ U*. It is not possible to find a basis in which all SU(2) matrices 
are simultaneously real, however. If such a basis existed we could specify the 
matrices by only two real parameters — but we have seen that we need three 
real numbers to describe all possible SU(2) matrices. 
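The conjugation identity (5.5) is easy to confirm numerically. The sketch below assumes a particular parametrization of a and b by three real angles (the names `theta`, `phi`, `psi` and the helper `su2` are illustrative, not from the text) and checks that $C^{-1} U C = U^*$ with $C$ the matrix appearing on the right of (5.5).

```python
import cmath
import math

def matmul(A, B):
    """Product of two 2-by-2 matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def su2(theta, phi, psi):
    """U = [[a, -b*], [b, a*]] with |a|^2 + |b|^2 = 1, built from three real angles."""
    a = math.cos(theta) * cmath.exp(1j * phi)
    b = math.sin(theta) * cmath.exp(1j * psi)
    return [[a, -b.conjugate()], [b, a.conjugate()]]

C = [[0, -1], [1, 0]]
C_inv = [[0, 1], [-1, 0]]

U = su2(0.3, 1.1, -0.7)                      # arbitrary sample angles
U_conj = [[z.conjugate() for z in row] for row in U]
lhs = matmul(C_inv, matmul(U, C))            # C^{-1} U C

# Largest entrywise deviation between C^{-1} U C and U^*.
max_err = max(abs(lhs[i][j] - U_conj[i][j]) for i in range(2) for j in range(2))
```

The deviation `max_err` should vanish up to rounding for any choice of the three angles, which is the statement $U \sim U^*$.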
Direct Sum and Direct Product

We can obtain new representations from old by combining them.

Given two representations $D^{(1)}(g)$, $D^{(2)}(g)$, we can form their direct sum
$D^{(1)} \oplus D^{(2)}$ as the block-diagonal matrix

$$ \begin{pmatrix} D^{(1)}(g) & 0 \\ 0 & D^{(2)}(g) \end{pmatrix}. \tag{5.6} $$

We are particularly interested in taking a representation and breaking it up
as a direct sum of irreducible representations.

Given two representations $D^{(1)}(g)$, $D^{(2)}(g)$, we can combine them in a
different way by taking their direct product $D^{(1)} \otimes D^{(2)}$, the natural action
of the group on the tensor product of the representation spaces. In other
words, if $D^{(1)}(g) e^{(1)}_j = e^{(1)}_i D^{(1)}_{ij}(g)$ and $D^{(2)}(g) e^{(2)}_j = e^{(2)}_i D^{(2)}_{ij}(g)$ we define

$$ [D^{(1)} \otimes D^{(2)}](g)\,(e^{(1)}_i \otimes e^{(2)}_j) = (e^{(1)}_k \otimes e^{(2)}_l)\, D^{(1)}_{ki}(g) D^{(2)}_{lj}(g). \tag{5.7} $$

We think of $D^{(1)}_{ki}(g) D^{(2)}_{lj}(g)$ as being the entries in the direct-product
matrix

$$ [D^{(1)}(g) \otimes D^{(2)}(g)]_{kl,ij} $$

whose rows and columns are indexed by pairs of numbers. The dimension of
the product representation is therefore the product of the dimensions of its
factors.

Exercise 5.9: Show that if D(g) is a representation, then so is

$$ D'(g) = [D(g^{-1})]^T, $$

where the superscript T denotes the transposed matrix.

Exercise 5.10: Show that a map that assigns every element of a group G to 
the 1-by-1 identity matrix is a representation. It is, not unreasonably, called
the trivial representation. 

Exercise 5.11: A representation $D : G \to GL(n, C)$ that assigns an element
$g \in G$ to the n-by-n identity matrix $I_n$ if and only if g = e is said to be
faithful. Let D be a non-trivial, but non-faithful, representation of G by
n-by-n matrices. Let $H \subset G$ consist of those elements h such that $D(h) = I_n$.
Show that H is a normal subgroup of G, and that D projects to a faithful
representation of the quotient group G/H.
Exercise 5.12: Let A and B be linear maps from $U \to U$ and C and D be
linear maps from $V \to V$. Then the direct products $A \otimes C$ and $B \otimes D$ are
linear maps from $U \otimes V \to U \otimes V$. Show that

$$ (A \otimes C)(B \otimes D) = (AB) \otimes (CD). $$

Show also that

$$ (A \oplus C)(B \oplus D) = (AB) \oplus (CD). $$

Exercise 5.13: Let A and B be m-by-m and n-by-n matrices respectively, and
let $I_n$ denote the n-by-n unit matrix. Show that:

i) $\mathrm{tr}(A \oplus B) = \mathrm{tr}(A) + \mathrm{tr}(B)$.

ii) $\mathrm{tr}(A \otimes B) = \mathrm{tr}(A)\,\mathrm{tr}(B)$.

iii) $\exp(A \oplus B) = \exp(A) \oplus \exp(B)$.

iv) $\exp(A \otimes I_n + I_m \otimes B) = \exp(A) \otimes \exp(B)$.

v) $\det(A \oplus B) = \det(A)\det(B)$.

vi) $\det(A \otimes B) = (\det A)^n (\det B)^m$.



5.2.1 Reducibility and Irreducibility 

The "atoms" of representation theory are those representations that cannot, 
by a clever choice of basis, be decomposed into, or reduced to, a direct sum 
of smaller representations. Such a representation is said to be irreducible. It 
is not easy to tell by just looking at a representation whether it is reducible
or not. We need to develop some tools. We begin with a more powerful
definition of irreducibility.

We first introduce the notion of an invariant subspace. Suppose we have
a set $\{A_\alpha\}$ of linear maps acting on a vector space V. A subspace $U \subseteq V$
is an invariant subspace for the set if $x \in U \Rightarrow A_\alpha x \in U$ for all $A_\alpha$.
The set $\{A_\alpha\}$ is irreducible if the only invariant subspaces are V itself and
{0}. Conversely, if there is a non-trivial invariant subspace, then the set^4 of
operators is reducible.

If the $A_\alpha$'s possess a non-trivial invariant subspace U, and we decompose
$V = U \oplus U'$, where U' is a complementary subspace, then, in a basis adapted
to this decomposition, the matrices $A_\alpha$ take the block-partitioned form of
figure 5.1.



4 Irreducibility is a property of the set as a whole. Any individual matrix always has a 
non-trivial invariant subspace because it possesses at least one eigenvector. 
Figure 5.1: Block partitioned reducible matrices.

If we can find a^5 complementary subspace U' which is also invariant, then
we have the block partitioned form of figure 5.2.

Figure 5.2: Completely reducible matrices.

We say that such matrices are completely reducible. When our linear
operators are unitary with respect to some inner product, we can take the
complementary subspace to be the orthogonal complement. This, by unitarity,
is automatically invariant. Thus, unitarity and reducibility together imply
complete reducibility.



Schur's Lemma 

The most useful results concerning irreducibility come from:
Schur's Lemma: Suppose we have two sets of linear operators $A_\alpha : U \to U$
and $B_\alpha : V \to V$ that act irreducibly on their spaces, and an intertwining
operator $\Lambda : U \to V$ such that

$$ \Lambda A_\alpha = B_\alpha \Lambda \tag{5.8} $$

for all $\alpha$. Then either
a) $\Lambda = 0$,
or
b) $\Lambda$ is 1-1 and onto (and hence invertible), in which case U and V have
   the same dimension and $A_\alpha = \Lambda^{-1} B_\alpha \Lambda$.
The proof is straightforward: The relation (5.8) shows that $\mathrm{Ker}(\Lambda) \subseteq U$ and
$\mathrm{Im}(\Lambda) \subseteq V$ are invariant subspaces for the sets $\{A_\alpha\}$ and $\{B_\alpha\}$ respectively.
Consequently, either $\Lambda = 0$, or $\mathrm{Ker}(\Lambda) = \{0\}$ and $\mathrm{Im}(\Lambda) = V$. In the latter
case $\Lambda$ is 1-1 and onto, and hence invertible.

5 Remember that complementary subspaces are not unique.

Corollary: If $\{A_\alpha\}$ acts irreducibly on an n-dimensional vector space, and
there is an operator $\Lambda$ such that

$$ \Lambda A_\alpha = A_\alpha \Lambda, \tag{5.9} $$

then either $\Lambda = 0$ or $\Lambda = \lambda I$. To see this observe that (5.9) remains true if
$\Lambda$ is replaced by $(\Lambda - xI)$. Now $\det(\Lambda - xI)$ is a polynomial in x of degree
n, and, by the fundamental theorem of algebra, has at least one root, $x = \lambda$.
Since its determinant is zero, $(\Lambda - \lambda I)$ is not invertible, and so must vanish
by Schur's lemma.

5.2.2 Characters and Orthogonality 

Unitary Representations of Finite Groups 

Let G be a finite group and let $g \mapsto D(g)$ be a representation of G by matrices
acting on a vector space V. Let $\langle x, y \rangle$ denote a positive-definite, conjugate-
symmetric, sesquilinear inner product of two vectors in V. From $\langle\ ,\ \rangle$ we
construct a new inner product $\langle\langle\ ,\ \rangle\rangle$ by averaging over the group

$$ \langle\langle x, y \rangle\rangle = \sum_{g \in G} \langle D(g)x, D(g)y \rangle. \tag{5.10} $$

It is easy to see that this new inner product remains positive definite, and in
addition has the property that

$$ \langle\langle D(g)x, D(g)y \rangle\rangle = \langle\langle x, y \rangle\rangle. \tag{5.11} $$

This means that the maps $D(g) : V \to V$ are unitary with respect to the
new product. If we change basis to one that is orthonormal with respect to
this new product then the D(g) become unitary matrices, with $D(g^{-1}) =
D^{-1}(g) = D^\dagger(g)$, where $D^\dagger_{ij}(g) = D^*_{ji}(g)$ denotes the conjugate-transposed
matrix.
We conclude that representations of finite groups can always be taken
to be unitary. This leads to the important consequence that for such
representations reducibility implies complete reducibility. Warning: In this
construction it is essential that the sum over the $g \in G$ converge. This is
guaranteed for a finite group, but may not work for infinite groups. In
particular, non-compact simple Lie groups, such as the Lorentz group, have no
non-trivial finite dimensional unitary representations.
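The averaging construction can be tried out on the smallest possible case. The following is a minimal sketch, assuming a hypothetical real two-dimensional representation of $Z_2 = \{e, a\}$ that is not orthogonal with respect to the standard dot product; the matrix `Da` is chosen for illustration only. For real matrices the averaged product is $\langle\langle x, y \rangle\rangle = x^T S y$ with $S = \sum_g D(g)^T D(g)$.

```python
def matmul(A, B):
    """Product of two matrices stored as nested lists."""
    return [[sum(A[i][k] * B[k][j] for k in range(len(B)))
             for j in range(len(B[0]))] for i in range(len(A))]

def transpose(A):
    return [list(row) for row in zip(*A)]

# A real representation of Z_2 = {e, a}: D(a)^2 = I, but D(a)^T D(a) != I.
I2 = [[1, 0], [0, 1]]
Da = [[1, 1], [0, -1]]
rep = [I2, Da]

# Gram matrix of the group-averaged inner product.
S = [[0, 0], [0, 0]]
for D in rep:
    DtD = matmul(transpose(D), D)
    S = [[S[i][j] + DtD[i][j] for j in range(2)] for i in range(2)]

# Invariance <<D(g)x, D(g)y>> = <<x, y>> is the matrix statement D^T S D = S.
invariant = all(matmul(transpose(D), matmul(S, D)) == S for D in rep)
```

Here `S` is symmetric and positive definite, so a basis orthonormal with respect to it exists, and in that basis both matrices become orthogonal.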

Orthogonality of the Matrix Elements 

Now let $D^J(g) : V_J \to V_J$ be the matrices of an irreducible representation
or irrep. Here J is a label which distinguishes inequivalent irreps from one
another. We will use the symbol dim J to denote the dimension of the
representation vector space $V_J$.

Let $D^K$ be an irrep that is either identical to $D^J$ or inequivalent, and let
$M_{ij}$ be a matrix possessing the appropriate number of rows and columns for
the product $D^J M D^K$ to be defined, but otherwise arbitrary. The sum

$$ \Lambda = \sum_{g \in G} D^J(g^{-1})\, M\, D^K(g) \tag{5.12} $$

obeys $D^J(g)\Lambda = \Lambda D^K(g)$ for any g. Consequently, Schur's lemma tells us
that

$$ \Lambda_{il} = \sum_{g \in G} D^J_{ij}(g^{-1})\, M_{jk}\, D^K_{kl}(g) = \lambda(M)\, \delta_{il}\, \delta^{JK}. \tag{5.13} $$

We have written $\lambda(M)$ to stress that the number $\lambda$ depends on the chosen
matrix M. Now take M to be zero everywhere except for one entry of unity
in row j column k. Then we have

$$ \sum_{g \in G} D^J_{ij}(g^{-1})\, D^K_{kl}(g) = \lambda_{jk}\, \delta_{il}\, \delta^{JK}, \tag{5.14} $$

where we have relabelled $\lambda$ to indicate its dependence on the location (j, k)
of the non-zero entry in M. We can find the constants $\lambda_{jk}$ by assuming that
K = J, setting i = l, and summing over i. We find

$$ |G|\, \delta_{jk} = \lambda_{jk} \dim J. \tag{5.15} $$

Putting these results together we find that

$$ \frac{1}{|G|} \sum_{g \in G} D^J_{ij}(g^{-1})\, D^K_{kl}(g) = (\dim J)^{-1}\, \delta_{jk}\, \delta_{il}\, \delta^{JK}. \tag{5.16} $$
When our matrices D(g) are unitary, we can write this as

$$ \frac{1}{|G|} \sum_{g \in G} \left(D^J_{ij}(g)\right)^* D^K_{kl}(g) = (\dim J)^{-1}\, \delta_{ik}\, \delta_{jl}\, \delta^{JK}. \tag{5.17} $$

If we consider complex-valued functions $G \to C$ as forming a vector space,
then the $D^J_{ij}$ are elements of this space and are mutually orthogonal with
respect to its natural inner product.

There can be no more orthogonal functions on G than the dimension of
the function space itself, which is |G|. We therefore have a constraint

$$ \sum_J (\dim J)^2 \le |G| \tag{5.18} $$

that places a limit on how many inequivalent representations can exist. In
fact, as you will show later, the equality holds: the sum of the squares of the
dimensions of the inequivalent irreducible representations is equal to the
order of G, and consequently the matrix elements form a complete orthogonal
set of functions on G.



Class functions and characters 

Because

$$ \mathrm{tr}\,(C^{-1} D C) = \mathrm{tr}\, D, \tag{5.19} $$

the trace of a representation matrix is the same for equivalent representations.
Further, because

$$ \mathrm{tr}\, D(g_1^{-1} g g_1) = \mathrm{tr}\left(D^{-1}(g_1) D(g) D(g_1)\right) = \mathrm{tr}\, D(g), \tag{5.20} $$

the trace is the same for all group elements in a conjugacy class. The
character,

$$ \chi(g) = \mathrm{tr}\, D(g), \tag{5.21} $$

is therefore said to be a class function.

By taking the trace of the matrix-element orthogonality relation we see
that the characters $\chi^J = \mathrm{tr}\, D^J$ of the irreducible representations obey

$$ \frac{1}{|G|} \sum_{g \in G} \left(\chi^J(g)\right)^* \chi^K(g) = \frac{1}{|G|} \sum_i d_i \left(\chi^J_i\right)^* \chi^K_i = \delta^{JK}, \tag{5.22} $$

where $d_i$ is the number of elements in the i-th conjugacy class.

The completeness of the matrix elements as functions on G implies that
the characters form a complete orthogonal set of functions on the space of
conjugacy classes equipped with inner product

$$ \langle \chi^1, \chi^2 \rangle = \frac{1}{|G|} \sum_i d_i \left(\chi^1_i\right)^* \chi^2_i. \tag{5.23} $$

Consequently there are exactly as many inequivalent irreducible
representations as there are conjugacy classes in the group.

Given a reducible representation, D(g), we can find out exactly which
irreps J it contains, and how many times, $n_J$, they occur. We do this by forming
the compound character

$$ \chi(g) = \mathrm{tr}\, D(g) \tag{5.24} $$

and observing that if we can find a basis in which

$$ D(g) = \underbrace{(D^1(g) \oplus D^1(g) \oplus \cdots)}_{n_1 \text{ terms}} \oplus \underbrace{(D^2(g) \oplus D^2(g) \oplus \cdots)}_{n_2 \text{ terms}} \oplus \cdots, \tag{5.25} $$

then

$$ \chi(g) = n_1 \chi^1(g) + n_2 \chi^2(g) + \cdots \tag{5.26} $$

From this we find

$$ n_J = \langle \chi, \chi^J \rangle = \frac{1}{|G|} \sum_i d_i \left(\chi_i\right)^* \chi^J_i. \tag{5.27} $$

There are extensive tables of group characters. Table 5.2 shows, for
example, the characters of the group $S_4$ of permutations on 4 objects.

               Typical element and class size
  Irrep     (1)    (12)    (123)    (1234)    (12)(34)
             1      6        8        6          3
  A_1        1      1        1        1          1
  A_2        1     -1        1       -1          1
  E          2      0       -1        0          2
  T_1        3      1        0       -1         -1
  T_2        3     -1        0        1         -1

Table 5.2: Character table of $S_4$.
Since $\chi^J(e) = \dim J$ we see that the irreps $A_1$ and $A_2$ are one dimensional,
that E is two dimensional, and that $T_{1,2}$ are both three dimensional. Also
we confirm that the sum of the squares of the dimensions

$$ 1^2 + 1^2 + 2^2 + 3^2 + 3^2 = 24 = 4! $$

is equal to the order of the group.

As a further illustration of how to read table 5.2, let us verify the
orthonormality of the characters of the representations $T_1$ and $T_2$. We have

$$ \langle \chi^{T_1}, \chi^{T_2} \rangle = \frac{1}{|G|} \sum_i d_i \left(\chi^{T_1}_i\right)^* \chi^{T_2}_i = \frac{1}{24}\left[1\cdot3\cdot3 - 6\cdot1\cdot1 + 8\cdot0\cdot0 - 6\cdot1\cdot1 + 3\cdot1\cdot1\right] = 0, $$

while

$$ \langle \chi^{T_1}, \chi^{T_1} \rangle = \frac{1}{|G|} \sum_i d_i \left(\chi^{T_1}_i\right)^* \chi^{T_1}_i = \frac{1}{24}\left[1\cdot3\cdot3 + 6\cdot1\cdot1 + 8\cdot0\cdot0 + 6\cdot1\cdot1 + 3\cdot1\cdot1\right] = 1. $$

The sum giving $\langle \chi^{T_2}, \chi^{T_2} \rangle = 1$ is identical to this.
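The same bookkeeping can be automated. The sketch below transcribes the $S_4$ character table into Python and evaluates the inner product (5.23) for every pair of irreps. As an illustration of the multiplicity formula (5.27) it also decomposes the four-dimensional permutation representation of $S_4$; that representation is an assumed example, not one discussed in the text, and its character is the number of objects left fixed by a typical element of each class.

```python
from fractions import Fraction

class_sizes = [1, 6, 8, 6, 3]      # (1), (12), (123), (1234), (12)(34)
table = {
    "A1": [1,  1,  1,  1,  1],
    "A2": [1, -1,  1, -1,  1],
    "E":  [2,  0, -1,  0,  2],
    "T1": [3,  1,  0, -1, -1],
    "T2": [3, -1,  0,  1, -1],
}
order = sum(class_sizes)           # |G| = 24

def inner(chi1, chi2):
    """<chi1, chi2> = (1/|G|) sum_i d_i chi1_i^* chi2_i (all characters here are real)."""
    return Fraction(sum(d * a * b for d, a, b in zip(class_sizes, chi1, chi2)), order)

# Gram matrix of the irreducible characters; orthonormality makes it the identity.
gram = {(J, K): inner(table[J], table[K]) for J in table for K in table}

# Character of the permutation representation on 4 objects (fixed-point counts).
chi_perm = [4, 2, 1, 0, 0]
multiplicities = {J: inner(chi_perm, chi) for J, chi in table.items()}
```

With these inputs the permutation representation contains $A_1$ and $T_1$ once each and nothing else, consistent with its dimension 4 = 1 + 3.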

Exercise 5.14: Let $D^1$ and $D^2$ be representations with characters $\chi^1(g)$ and
$\chi^2(g)$ respectively. Show that the character of the direct product
representation $D^1 \otimes D^2$ is given by

$$ \chi^{1 \otimes 2}(g) = \chi^1(g)\, \chi^2(g). $$



5.2.3 The Group Algebra 

Given a finite group G, we construct a vector space C(G) whose basis vectors
are in one-to-one correspondence with the elements of the group. We denote
the vector corresponding to the group element g by the boldface symbol $\mathbf{g}$.
A general element of C(G) is therefore a formal sum

$$ \mathbf{x} = x_1 \mathbf{g}_1 + x_2 \mathbf{g}_2 + \cdots + x_{|G|}\, \mathbf{g}_{|G|}. \tag{5.28} $$

We take products of these sums by using the group multiplication rule. If
$g_1 g_2 = g_3$ we set $\mathbf{g}_1 \mathbf{g}_2 = \mathbf{g}_3$, and require the product to be distributive with
respect to vector-space addition. Thus

$$ \mathbf{g}\mathbf{x} = x_1 \mathbf{g}\mathbf{g}_1 + x_2 \mathbf{g}\mathbf{g}_2 + \cdots + x_{|G|}\, \mathbf{g}\mathbf{g}_{|G|}. \tag{5.29} $$
The resulting mathematical structure is called the group algebra. It was
introduced by Frobenius.

The group algebra, considered as a vector space, is automatically a
representation. We define the natural action of G on C(G) by setting

$$ D(g)\mathbf{g}_i = \mathbf{g}\,\mathbf{g}_i = \mathbf{g}_j D_{ji}(g). \tag{5.30} $$

The matrices $D_{ji}(g)$ make up the regular representation. Because the list
$g g_1, g g_2, \ldots$ is a permutation of the list $g_1, g_2, \ldots$, their entries consist of 1's
and 0's, with exactly one non-zero entry in each row and each column.

Exercise 5.15: Show that the character of the regular representation has
$\chi(e) = |G|$, and $\chi(g) = 0$ for $g \neq e$.

Exercise 5.16: Use the previous exercise to show that the number of times
an n dimensional irrep occurs in the regular representation is n. Deduce that
$|G| = \sum_J (\dim J)^2$, and from this construct the completeness proof for the
representations and characters.

Projection Operators

A representation $D^J$ of the group automatically provides a representation of
the group algebra. We simply set

$$ D^J(x_1 \mathbf{g}_1 + x_2 \mathbf{g}_2 + \cdots) = x_1 D^J(g_1) + x_2 D^J(g_2) + \cdots. \tag{5.31} $$

Certain linear combinations of group elements turn out to be very useful
because the corresponding matrices can be used to project out vectors with
desirable symmetry properties.
Consider the elements

$$ e^J_{\alpha\beta} = \frac{\dim J}{|G|} \sum_{g \in G} \left[D^J_{\alpha\beta}(g)\right]^* \mathbf{g} \tag{5.32} $$

of the group algebra. These have the property that

$$ \begin{aligned}
\mathbf{g}_1 e^J_{\alpha\beta} &= \frac{\dim J}{|G|} \sum_{g \in G} \left[D^J_{\alpha\beta}(g)\right]^* (\mathbf{g}_1 \mathbf{g}) \\
 &= \frac{\dim J}{|G|} \sum_{g \in G} \left[D^J_{\alpha\beta}(g_1^{-1} g)\right]^* \mathbf{g} \\
 &= \left[D^J_{\alpha\gamma}(g_1^{-1})\right]^* \frac{\dim J}{|G|} \sum_{g \in G} \left[D^J_{\gamma\beta}(g)\right]^* \mathbf{g} \\
 &= e^J_{\gamma\beta}\, D^J_{\gamma\alpha}(g_1).
\end{aligned} \tag{5.33} $$

In going from the first to the second line we have changed summation
variables from $g \to g_1^{-1} g$, and going from the second to the third line we have
used the representation property to write $D^J(g_1^{-1} g) = D^J(g_1^{-1}) D^J(g)$.

From $\mathbf{g}_1 e^J_{\alpha\beta} = e^J_{\gamma\beta} D^J_{\gamma\alpha}(g_1)$ and the matrix-element orthogonality, it
follows that

$$ \begin{aligned}
e^J_{\alpha\beta}\, e^K_{\gamma\delta} &= \frac{\dim J}{|G|} \sum_{g \in G} \left[D^J_{\alpha\beta}(g)\right]^* \mathbf{g}\, e^K_{\gamma\delta} \\
 &= \frac{\dim J}{|G|} \sum_{g \in G} \left[D^J_{\alpha\beta}(g)\right]^* D^K_{\epsilon\gamma}(g)\, e^K_{\epsilon\delta} \\
 &= \delta^{JK} \delta_{\alpha\epsilon} \delta_{\beta\gamma}\, e^K_{\epsilon\delta} \\
 &= \delta^{JK} \delta_{\beta\gamma}\, e^J_{\alpha\delta}.
\end{aligned} \tag{5.34} $$

For each J, this multiplication rule of the $e^J_{\alpha\beta}$ is identical to that of matrices
having zero entries everywhere except for the $(\alpha, \beta)$-th, which is a "1." There
are $(\dim J)^2$ of these $e^J_{\alpha\beta}$ for each irrep J, and they
are linearly independent. Because $\sum_J (\dim J)^2 = |G|$, they form a basis for
the algebra. In particular every element of G can be reconstructed as

$$ \mathbf{g} = \sum_J D^J_{ij}(g)\, e^J_{ij}. \tag{5.35} $$

We can also define the useful objects

$$ P^J = \sum_i e^J_{ii} = \frac{\dim J}{|G|} \sum_{g \in G} \left[\chi^J(g)\right]^* \mathbf{g}. \tag{5.36} $$

They have the property

$$ P^J P^K = \delta^{JK} P^K, \qquad \sum_J P^J = I, \tag{5.37} $$

where I is the identity element of C(G). The $P^J$ are therefore projection
operators composing a resolution of the identity. Their utility resides in the
fact that when D(g) is a reducible representation acting on a linear space

$$ V = \bigoplus_J V_J, \tag{5.38} $$

then setting $\mathbf{g} \to D(g)$ in the formula for $P^J$ results in a projection matrix
from V onto the irreducible component $V_J$. To see how this comes about, let
$v \in V$ and, for any fixed p, set

$$ v_i = e^J_{ip} v, \tag{5.39} $$

where $e^J_{ip} v$ should be understood as shorthand for $D(e^J_{ip}) v$. Then

$$ D(g) v_i = \mathbf{g}\, e^J_{ip} v = e^J_{jp} v\, D^J_{ji}(g) = v_j D^J_{ji}(g). \tag{5.40} $$

We see the $v_i$, if not all zero, are basis vectors for $V_J$. Since $P^J$ is a sum of
the $e^J_{ii}$, the vector $P^J v$ is a sum of such vectors, and therefore lies in $V_J$. The
advantage of using $P^J$ over any individual $e^J_{\alpha\beta}$ is that $P^J$ can be computed
from the character table, i.e. its construction does not require knowledge of the
irreducible representation matrices.
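As a concrete check of the projector algebra, the following sketch builds the regular representation of $S_3$ from the rule $g g_i = g_j D_{ji}(g)$ and forms the character projectors $P^J$ in that representation. The choice of $S_3$, its standard character table (irreps labelled here $A_1$, $A_2$, $E$), and all helper names are assumptions made for illustration; they do not appear in the text.

```python
from fractions import Fraction
from itertools import permutations

G = list(permutations(range(3)))           # S_3; p[i] is the image of i
index = {g: i for i, g in enumerate(G)}
n = len(G)

def compose(p, q):
    """Group product pq: apply q first, then p."""
    return tuple(p[q[i]] for i in range(3))

def regular(g):
    """Regular-representation matrix: column i has its single 1 in row j, where g g_i = g_j."""
    D = [[0] * n for _ in range(n)]
    for i, gi in enumerate(G):
        D[index[compose(g, gi)]][i] = 1
    return D

def matmul(A, B):
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def fixed_points(g):
    return sum(1 for i in range(3) if g[i] == i)

# Characters of S_3 keyed by the number of fixed points of g:
# 3 -> identity, 1 -> transposition, 0 -> 3-cycle.
CHI = {
    "A1": {3: 1, 1: 1, 0: 1},
    "A2": {3: 1, 1: -1, 0: 1},
    "E":  {3: 2, 1: 0, 0: -1},
}

def projector(J):
    """P^J = (dim J / |G|) sum_g chi^J(g)^* D(g); the characters are real here."""
    dim = CHI[J][3]
    P = [[Fraction(0)] * n for _ in range(n)]
    for g in G:
        D = regular(g)
        coeff = Fraction(dim * CHI[J][fixed_points(g)], n)
        for i in range(n):
            for j in range(n):
                P[i][j] += coeff * D[i][j]
    return P

P = {J: projector(J) for J in CHI}
```

Multiplying the resulting matrices confirms $P^J P^K = \delta^{JK} P^K$ and $\sum_J P^J = I$ in this example, using nothing but the character table.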

The algebra of classes 

If a conjugacy class $\mathcal{C}_i$ consists of the elements $\{g_1, g_2, \ldots, g_{d_i}\}$, we can define
$C_i$ to be the corresponding element of the group algebra:

$$ C_i = \frac{1}{d_i}\left(\mathbf{g}_1 + \mathbf{g}_2 + \cdots + \mathbf{g}_{d_i}\right). \tag{5.41} $$

(The factor of $1/d_i$ is a conventional normalization.) Because conjugation
merely permutes the elements of a conjugacy class, we have $\mathbf{g}^{-1} C_i\, \mathbf{g} = C_i$
for all $g \in G$. The $C_i$ therefore commute with every element of C(G).
Conversely any element of C(G) that commutes with everything in C(G)
must be a linear combination $C = c_1 C_1 + c_2 C_2 + \cdots$. The subspace of C(G)
consisting of sums of the classes is therefore the centre Z[C(G)] of the group
algebra. Because the product $C_i C_j$ commutes with everything, it lies in
Z[C(G)] and so there are constants $c_{ij}{}^k$ such that

$$ C_i C_j = \sum_k c_{ij}{}^k\, C_k. \tag{5.42} $$

We can regard the $C_i$ as being linear maps from Z[C(G)] to itself, whose
associated matrices have entries $(C_i)^k{}_j = c_{ij}{}^k$. These matrices commute,
and can be simultaneously diagonalized. We will leave it as an exercise for the
reader to demonstrate that

$$ C_i P^J = \left(\frac{\chi^J_i}{\chi^J_0}\right) P^J. \tag{5.43} $$

Here $\chi^J_0 \equiv \chi^J(e) = \dim J$. The common eigenvectors of the $C_i$ are therefore
the projection operators $P^J$, and the eigenvalues $\lambda^J_i = \chi^J_i / \chi^J_0$ are, up to
normalization, the characters. Equation (5.43) provides a convenient method
for computing the characters from knowledge only of the coefficients $c_{ij}{}^k$
appearing in the class multiplication table. Once we have found the
eigenvalues $\lambda^J_i$, we recover the $\chi^J_i$ by noting that $\chi^J_0$ is real and positive, and that
$\sum_i d_i |\chi^J_i|^2 = |G|$.

Exercise 5.17: Use Schur's lemma to show that for an irrep $D^J(g)$ we have

$$ \frac{1}{d_i} \sum_{g \in \mathcal{C}_i} D^J_{jk}(g) = \frac{1}{\dim J}\, \chi^J_i\, \delta_{jk}, $$

and hence establish (5.43).



5.3 Physics Applications 
5.3.1 Quantum Mechanics 

When a group G acts on a mechanical system, then G will act as a set of
linear operators D(g) on the Hilbert space $\mathcal{H}$ of the corresponding quantum
system. Thus $\mathcal{H}$ will be a representation^6 space for G. If the group is a
symmetry of the system then the D(g) will commute with the hamiltonian
H. If this is so, and if we can decompose

$$ \mathcal{H} = \bigoplus_{\text{irreps } J} \mathcal{H}_J \tag{5.44} $$

into H-invariant irreps of G then Schur's lemma tells us that in each $\mathcal{H}_J$ the
hamiltonian H will act as a multiple of the identity operator. In other words
every state in $\mathcal{H}_J$ will be an eigenstate of H with a common energy $E_J$.

This fact can greatly simplify the task of finding the energy levels. If
an irrep J occurs only once in the decomposition of $\mathcal{H}$ then we can find the
eigenstates directly by applying the projection operator $P^J$ to vectors in $\mathcal{H}$.

6 The rules of quantum mechanics only require that $D(g_1) D(g_2) = e^{i\phi(g_1, g_2)} D(g_1 g_2)$.
A set of matrices that obeys the group multiplication rule "up to a phase" is called a
projective (or ray) representation. In many cases, however, we can choose the D(g) so
that $\phi$ is not needed. This is the case in all the examples we discuss.
If the irrep occurs $n_J$ times in the decomposition, then $P^J$ will project to the
reducible subspace

$$ \underbrace{\mathcal{H}_J \oplus \mathcal{H}_J \oplus \cdots \oplus \mathcal{H}_J}_{n_J \text{ copies}} = \mathcal{M} \otimes \mathcal{H}_J. $$

Here $\mathcal{M}$ is an $n_J$ dimensional multiplicity space. The hamiltonian H will act
in $\mathcal{M}$ as an $n_J$-by-$n_J$ matrix. In other words, if the vectors

$$ |n, i\rangle = |n\rangle \otimes |i\rangle \in \mathcal{M} \otimes \mathcal{H}_J \tag{5.45} $$

form a basis for $\mathcal{M} \otimes \mathcal{H}_J$, with n labelling which copy of $\mathcal{H}_J$ the vector $|n, i\rangle$
lies in, then

$$ H|n, i\rangle = |m, i\rangle H^J_{mn}, \qquad D(g)|n, i\rangle = |n, j\rangle D^J_{ji}(g). \tag{5.46} $$

Diagonalizing $H^J_{mn}$ provides us with $n_J$ H-invariant copies of $\mathcal{H}_J$ and gives
us the energy eigenstates.

Consider, for example, the molecule $C_{60}$ (buckminsterfullerene) consisting
of 60 carbon atoms in the form of a soccer ball. The chemically active
electrons can be treated in a tight-binding approximation in which the Hilbert
space has dimension 60: one $\pi$-orbital basis state for each carbon atom.
The geometric symmetry group of the molecule is $Y_h = Y \times Z_2$, where Y is
the rotational symmetry group of the icosahedron (a subgroup of SO(3)) and
$Z_2$ is the parity inversion $\sigma : \mathbf{r} \mapsto -\mathbf{r}$. The characters of Y are displayed in
table 5.3.





               Typical element and class size
  Irrep     e      C_5          C_5^2        C_2     C_3
            1      12           12           15      20
  A         1      1            1            1       1
  T_1       3      \tau^{-1}    -\tau        -1      0
  T_2       3      -\tau        \tau^{-1}    -1      0
  G         4      -1           -1           0       1
  H         5      0            0            1       -1

Table 5.3: Character table for the group Y.






In this table $\tau = \frac{1}{2}(\sqrt{5} - 1)$ denotes the golden mean. The class $C_5$ is the
set of $2\pi/5$ rotations about an axis through the centres of a pair of antipodal
pentagonal faces, the class $C_3$ is the set of $2\pi/3$ rotations about an axis
through the centres of a pair of antipodal hexagonal faces, and $C_2$ is the set
of $\pi$ rotations about an axis through the midpoints of a pair of antipodal edges,
each lying between two adjacent hexagonal faces.




Figure 5.3: A sketch of the tight-binding electronic energy levels of $C_{60}$.

The geometric symmetry group acts on the 60-dimensional Hilbert space by
permuting the basis states concurrently with their associated atoms. Figure
5.3 shows how the 60 states are disposed into energy levels.^7 Each level is
labelled by a lower case letter specifying the irrep of Y, and by a subscript
g or u standing for gerade (German for even) or ungerade (German for odd)
that indicates whether the wavefunction is even or odd under the inversion
$\sigma : \mathbf{r} \mapsto -\mathbf{r}$.

The buckyball is roughly spherical, and the lowest 25 states can be
thought of as being derived from the angular-momentum eigenstates with L =
0, 1, 2, 3, 4, that classify the energy levels for an electron moving on a perfect
sphere. In the many-electron ground-state, the 30 single-particle states with
energy E < 0 are each occupied by pairs of spin up/down electrons.
The 30 states with E > 0 are empty.

7 After R. C. Haddon, L. E. Brus, K. Raghavachari, Chem. Phys. Lett. 125 (1986) 459. 
To explain, for example, why three copies of $T_1$ appear, and why two
of these are $T_{1u}$ and one $T_{1g}$, we must investigate the manner in which the
60-dimensional Hilbert space decomposes into irreducible representations of
the 120-element group $Y_h$. Problem 5.23 leads us through this computation, and
shows that no irrep of $Y_h$ occurs more than three times. In finding the energy
levels, we therefore never have to diagonalize a matrix bigger than 3-by-3.

The equality of the energies of the $h_g$ and $g_g$ levels at E = -1 is an
accidental degeneracy. It is not required by the symmetry, and will
presumably disappear in a more sophisticated calculation. The appearance of many
"accidental" degeneracies in an energy spectrum hints that there may be a
hidden symmetry that arises from something beyond geometry. For example,
in the Schrödinger spectrum of the hydrogen atom all states with the same
principal quantum number n have the same energy although they correspond
to different irreps L = 0, 1, ..., n - 1 of O(3). This degeneracy occurs because
the classical Kepler-orbit problem has symmetry group O(4), rather than the
naively expected O(3) rotational symmetry.

5.3.2 Vibrational spectrum of H2O 

The small vibrations of a mechanical system with n degrees of freedom are
governed by a Lagrangian of the form

$$ L = \frac{1}{2} \dot{x}^T M \dot{x} - \frac{1}{2} x^T V x, \tag{5.47} $$

where M and V are symmetric n-by-n matrices, with M positive
definite. This Lagrangian leads to the equations of motion

$$ M \ddot{x} = -V x. \tag{5.48} $$

We look for normal mode solutions $x(t) \propto e^{i\omega_i t} x_i$, where the vectors $x_i$ obey

$$ -\omega_i^2 M x_i = -V x_i. \tag{5.49} $$

The normal-mode frequencies are solutions of the secular equation

$$ \det(V - \omega^2 M) = 0, \tag{5.50} $$

and modes with distinct frequencies are orthogonal with respect to the inner
product defined by M,

$$ \langle x, y \rangle = x^T M y. \tag{5.51} $$
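A two-degree-of-freedom system makes the secular equation concrete. The sketch below assumes a hypothetical chain of two equal masses m joined to each other and to fixed walls by identical springs of constant k, so that M = diag(m, m) and V has 2k on the diagonal and -k off it; these values are illustrative and not taken from the text.

```python
import math

m, k = 1.0, 1.0                       # hypothetical mass and spring constant
M = [[m, 0.0], [0.0, m]]
V = [[2 * k, -k], [-k, 2 * k]]

# det(V - w2 M) = (2k - m w2)^2 - k^2 = 0 is a quadratic in w2 = omega^2:
#   m^2 w2^2 - 4 k m w2 + 3 k^2 = 0.
a, b, c = m * m, -4 * k * m, 3 * k * k
disc = math.sqrt(b * b - 4 * a * c)
w2_low, w2_high = (-b - disc) / (2 * a), (-b + disc) / (2 * a)

# Candidate mode shapes: in-phase and out-of-phase motion.
x_low, x_high = [1.0, 1.0], [1.0, -1.0]

def matvec(A, x):
    return [sum(A[i][j] * x[j] for j in range(2)) for i in range(2)]

def m_inner(x, y):
    """Inner product <x, y> = x^T M y of equation (5.51)."""
    return sum(xi * yi for xi, yi in zip(x, matvec(M, y)))

overlap = m_inner(x_low, x_high)      # vanishes for modes of distinct frequency
```

The two roots are $\omega^2 = k/m$ and $3k/m$, and the corresponding mode shapes satisfy $V x = \omega^2 M x$ and are orthogonal in the inner product defined by M.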
We are interested in solving this problem for vibrations about the
equilibrium configuration of a molecule. Suppose this equilibrium configuration
has a symmetry group G. This gives rise to an n-dimensional representation
on the space of x's in which

$$ g : x \mapsto D(g)x \tag{5.52} $$

leaves both the inertia matrix M and the potential matrix V unchanged:

$$ [D(g)]^T M D(g) = M, \qquad [D(g)]^T V D(g) = V. \tag{5.53} $$

Consequently, if we have an eigenvector $x_i$ with frequency $\omega_i$,

$$ -\omega_i^2 M x_i = -V x_i, \tag{5.54} $$

we see that $D(g)x_i$ also satisfies this equation. The frequency eigenspaces
are therefore left invariant by the action of D(g), and barring accidental
degeneracy, there will be a one-to-one correspondence between the frequency
eigenspaces and the irreducible representations occurring in D(g).

Consider, for example, the vibrational modes of the water molecule $H_2O$.
This familiar molecule has symmetry group $C_{2v}$ which is generated by two
elements: a rotation a through $\pi$ about an axis through the oxygen atom,
and a reflection b in the plane through the oxygen atom and bisecting the
angle between the two hydrogens. The product ab is a reflection in the plane
defined by the equilibrium position of the three atoms. The relations are
$a^2 = b^2 = (ab)^2 = e$, and the characters are displayed in table 5.4.





               class and size
  Irrep     e      a      b      ab
            1      1      1      1
  A_1       1      1      1      1
  A_2       1      1     -1     -1
  B_1       1     -1      1     -1
  B_2       1     -1     -1      1

Table 5.4: Character table of $C_{2v}$.
The group $C_{2v}$ is Abelian, so all the representations are one dimensional.
To find out what representations occur when $C_{2v}$ acts, we need to find
the character of its action D(g) on the nine-dimensional vector

$$ x = (x_O, y_O, z_O, x_{H_1}, y_{H_1}, z_{H_1}, x_{H_2}, y_{H_2}, z_{H_2}). \tag{5.55} $$

Here the coordinates $x_{H_2}$, $y_{H_2}$, $z_{H_2}$, etc. denote the displacements of the
labelled atom from its equilibrium position.

We take the molecule as lying in the xy plane, with the z axis pointing towards
us.




Figure 5.4: Water Molecule. 



The effect of the symmetry operations on the atomic displacements is

$$ \begin{aligned}
D(a)x &= (-x_O, +y_O, -z_O, -x_{H_2}, +y_{H_2}, -z_{H_2}, -x_{H_1}, +y_{H_1}, -z_{H_1}), \\
D(b)x &= (-x_O, +y_O, +z_O, -x_{H_2}, +y_{H_2}, +z_{H_2}, -x_{H_1}, +y_{H_1}, +z_{H_1}), \\
D(ab)x &= (+x_O, +y_O, -z_O, +x_{H_1}, +y_{H_1}, -z_{H_1}, +x_{H_2}, +y_{H_2}, -z_{H_2}).
\end{aligned} $$

Notice how the transformations D(a), D(b) have interchanged the
displacement co-ordinates of the two hydrogen atoms. In calculating the character
of a transformation we need look only at the effect on atoms that are left
fixed; those that are moved have matrix elements only in non-diagonal
positions. Thus, when computing the compound characters for a and b, we can
focus on the oxygen atom. For ab we need to look at all three atoms. We
find

$$ \begin{aligned}
\chi^D(e) &= 9, \\
\chi^D(a) &= -1 + 1 - 1 = -1, \\
\chi^D(b) &= -1 + 1 + 1 = 1, \\
\chi^D(ab) &= 1 + 1 - 1 + 1 + 1 - 1 + 1 + 1 - 1 = 3.
\end{aligned} $$
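By using the orthogonality relations, we find the decomposition

$$ \begin{pmatrix} 9 \\ -1 \\ 1 \\ 3 \end{pmatrix}
 = 3 \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \end{pmatrix}
 + \begin{pmatrix} 1 \\ 1 \\ -1 \\ -1 \end{pmatrix}
 + 2 \begin{pmatrix} 1 \\ -1 \\ 1 \\ -1 \end{pmatrix}
 + 3 \begin{pmatrix} 1 \\ -1 \\ -1 \\ 1 \end{pmatrix} \tag{5.56} $$

or

$$ \chi^D = 3\chi^{A_1} + \chi^{A_2} + 2\chi^{B_1} + 3\chi^{B_2}. \tag{5.57} $$

Thus, the nine-dimensional representation decomposes as

$$ D = 3A_1 \oplus A_2 \oplus 2B_1 \oplus 3B_2. \tag{5.58} $$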
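The characters and multiplicities above can be reproduced mechanically. The sketch below builds the four 9-by-9 matrices from the sign patterns and hydrogen swaps listed for D(a), D(b) and D(ab), takes their traces, and applies the multiplicity formula (5.27) with the $C_{2v}$ characters of table 5.4. The helper `rep_matrix` and the coordinate ordering (O, then H1, then H2) are illustrative choices.

```python
from fractions import Fraction

ATOMS = ["O", "H1", "H2"]   # coordinate order: (x, y, z) for O, then H1, then H2

def rep_matrix(signs, swap_hydrogens):
    """9x9 matrix multiplying (x, y, z) by `signs` and optionally swapping H1 <-> H2."""
    if swap_hydrogens:
        image = {"O": "O", "H1": "H2", "H2": "H1"}
    else:
        image = {a: a for a in ATOMS}
    D = [[0] * 9 for _ in range(9)]
    for a in ATOMS:
        src = 3 * ATOMS.index(a)            # displacement carried by atom a ...
        dst = 3 * ATOMS.index(image[a])     # ... reappears on atom image[a]
        for c in range(3):
            D[dst + c][src + c] = signs[c]
    return D

D = {
    "e":  rep_matrix((+1, +1, +1), False),
    "a":  rep_matrix((-1, +1, -1), True),
    "b":  rep_matrix((-1, +1, +1), True),
    "ab": rep_matrix((+1, +1, -1), False),
}

# Compound character: the trace of each matrix.
chi_D = {g: sum(D[g][i][i] for i in range(9)) for g in D}

C2V = {
    "A1": {"e": 1, "a": 1,  "b": 1,  "ab": 1},
    "A2": {"e": 1, "a": 1,  "b": -1, "ab": -1},
    "B1": {"e": 1, "a": -1, "b": 1,  "ab": -1},
    "B2": {"e": 1, "a": -1, "b": -1, "ab": 1},
}

# n_J = (1/|G|) sum_g chi^J(g) chi^D(g); every class has a single element.
n = {J: Fraction(sum(C2V[J][g] * chi_D[g] for g in D), 4) for J in C2V}
```

Only the oxygen block contributes to the traces of D(a) and D(b), because the swapped hydrogen blocks sit off the diagonal, which is the shortcut used in the text.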



How do we exploit this? First we cut out the junk. Out of the nine
modes, six correspond to easily identified zero-frequency motions: three
translations and three rotations. A translation in the x direction would have
$x_O = x_{H_1} = x_{H_2} = \xi$, all other entries being zero. This displacement vector
changes sign under both a and b, but is left fixed by ab. This behaviour
is characteristic of the representation $B_2$. Similarly we can identify $A_1$ as
translation in y, and $B_1$ as translation in z. A rotation about the y axis
makes $z_{H_1} = -z_{H_2} = \phi$. This is left fixed by a, but changes sign under b and
ab, so the y rotation mode is $A_2$. Similarly, rotations about the x and z axes
correspond to $B_1$ and $B_2$ respectively. All that is left for genuine vibrational
modes is $2A_1 \oplus B_2$.

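We now apply the projection operator

$$ P^{A_1} = \frac{1}{4}\left[(\chi^{A_1}(e))^* D(e) + (\chi^{A_1}(a))^* D(a) + (\chi^{A_1}(b))^* D(b) + (\chi^{A_1}(ab))^* D(ab)\right] \tag{5.59} $$

to $v_{H_1,x}$, a small displacement of $H_1$ in the x direction. We find

$$ P^{A_1} v_{H_1,x} = \frac{1}{4}\left(v_{H_1,x} - v_{H_2,x} - v_{H_2,x} + v_{H_1,x}\right) = \frac{1}{2}\left(v_{H_1,x} - v_{H_2,x}\right). \tag{5.60} $$

This mode is an eigenvector for the vibration problem.
If we apply $P^{A_1}$ to $v_{H_1,y}$ and $v_{O,y}$ we find

$$ P^{A_1} v_{H_1,y} = \frac{1}{2}\left(v_{H_1,y} + v_{H_2,y}\right), \qquad P^{A_1} v_{O,y} = v_{O,y}, \tag{5.61} $$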
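The projections (5.60) and (5.61) can be reproduced with a few lines of arithmetic. This is a minimal sketch, assuming the coordinate ordering (O, H1, H2) times (x, y, z) and describing each symmetry operation by its sign pattern and by whether it swaps the hydrogens; the names `act`, `OPS`, `project` and `unit` are illustrative.

```python
from fractions import Fraction

def act(signs, swap, v):
    """Apply a symmetry operation to a 9-vector ordered (O, H1, H2) x (x, y, z)."""
    O, H1, H2 = v[0:3], v[3:6], v[6:9]
    if swap:
        H1, H2 = H2, H1                    # the hydrogens exchange places
    return [s * c for atom in (O, H1, H2) for s, c in zip(signs, atom)]

# Sign patterns and swaps read off from D(a), D(b), D(ab).
OPS = {
    "e":  ((+1, +1, +1), False),
    "a":  ((-1, +1, -1), True),
    "b":  ((-1, +1, +1), True),
    "ab": ((+1, +1, -1), False),
}
CHI = {
    "A1": {"e": 1, "a": 1,  "b": 1,  "ab": 1},
    "B2": {"e": 1, "a": -1, "b": -1, "ab": 1},
}

def project(irrep, v):
    """P^J v = (1/4) sum_g chi^J(g) D(g) v (the characters are real)."""
    images = {g: act(signs, swap, v) for g, (signs, swap) in OPS.items()}
    return [Fraction(sum(CHI[irrep][g] * images[g][i] for g in OPS), 4)
            for i in range(9)]

def unit(i):
    """Unit displacement along coordinate i."""
    return [1 if j == i else 0 for j in range(9)]

half = Fraction(1, 2)
p_H1x = project("A1", unit(3))   # v_{H1,x}
p_H1y = project("A1", unit(4))   # v_{H1,y}
p_Oy = project("A1", unit(1))    # v_{O,y}
```

Replacing "A1" by "B2" in the calls gives the corresponding $B_2$ projections, which still have to be cleared of the zero modes as described in the text.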
but we are not quite done. These modes are contaminated by the y
translation direction zero mode, which is also in an $A_1$ representation. After
we make our modes orthogonal to this, there is only one left, and this has
$y_{H_1} = y_{H_2} = -y_O\, m_O/(2m_H) = a_1$, all other components vanishing.
We can similarly find vectors corresponding to $B_2$ as

$$ P^{B_2} v_{H_1,x} = \frac{1}{2}\left(v_{H_1,x} + v_{H_2,x}\right), \qquad P^{B_2} v_{H_1,y} = \frac{1}{2}\left(v_{H_1,y} - v_{H_2,y}\right), \qquad P^{B_2} v_{O,x} = v_{O,x}, $$

and these need to be cleared of both translations in the x direction and
rotations about the z axis, both of which transform under $B_2$. Again there
is only one mode left and it is

$$ y_{H_1} = -y_{H_2} = \alpha x_{H_1} = \alpha x_{H_2} = \beta x_O = a_2, \tag{5.62} $$

where $\alpha$ is chosen to ensure that there is no angular momentum about O,
and $\beta$ to make the total x linear momentum vanish. We have therefore
found three true vibration eigenmodes, two transforming under $A_1$ and one
under $B_2$ as advertised earlier. The eigenfrequencies, of course, depend on
the details of the spring constants, but now that we have the eigenvectors we
can just plug them in to find these.

5.3.3 Crystal Field Splittings 

A quantum mechanical system has a symmetry $G$ if the hamiltonian $H$ obeys

$$ D^{-1}(g) H D(g) = H, \qquad (5.63) $$

for some group action $D(g) : \mathcal{H} \to \mathcal{H}$ on the Hilbert space.
It follows that the eigenspaces, $\mathcal{H}_\lambda$, of states with a common
eigenvalue, $\lambda$, are invariant subspaces for the representation $D(g)$.

We often need to understand how a degeneracy is lifted by perturbations
that break $G$ down to a smaller subgroup $H$. An $n$-dimensional irreducible
representation of $G$ is automatically a representation of any subgroup of $G$,
but in general it will no longer be irreducible. Thus the $n$-fold degenerate
level is split into multiplets, one for each of the irreducible representations
of $H$ contained in the original representation. The manner in which an
originally irreducible representation decomposes under restriction to a
subgroup is known as the branching rule for the representation.

A physically important case is given by the breaking of the full SO(3)
rotation symmetry of an isolated atomic hamiltonian by a crystal field.
Suppose the crystal has octahedral symmetry. The characters of the octahedral
group are displayed in table 5.5.






                       Class (size)
  O       e    C_3 (8)   C_4^2 (3)   C_2 (6)   C_4 (6)
  A_1     1      1          1           1         1
  A_2     1      1          1          -1        -1
  E       2     -1          2           0         0
  F_2     3      0         -1           1        -1
  F_1     3      0         -1          -1         1

Table 5.5: Character table of the octahedral group O.



The classes are labelled by the rotation angles, $C_2$ being a twofold rotation
axis ($\theta = \pi$), $C_3$ a threefold axis ($\theta = 2\pi/3$), etc.

The character of the $J = l$ representation of SO(3) is

$$ \chi^{l}(\theta) = \frac{\sin\big((2l+1)\theta/2\big)}{\sin(\theta/2)}, \qquad (5.64) $$

and the first few $\chi^{l}$'s evaluated on the rotation angles of the classes
of $O$ are displayed in table 5.6.









                       Class (size)
  l       e    C_3 (8)   C_4^2 (3)   C_2 (6)   C_4 (6)
  0       1      1          1           1         1
  1       3      0         -1          -1         1
  2       5     -1          1           1        -1
  3       7      1         -1          -1        -1
  4       9      0          1           1         1

Table 5.6: Characters evaluated on rotation classes
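The 9-fold degenerate $l = 4$ multiplet therefore decomposes as

$$ \begin{pmatrix} 9 \\ 0 \\ 1 \\ 1 \\ 1 \end{pmatrix} = \begin{pmatrix} 1 \\ 1 \\ 1 \\ 1 \\ 1 \end{pmatrix} + \begin{pmatrix} 2 \\ -1 \\ 2 \\ 0 \\ 0 \end{pmatrix} + \begin{pmatrix} 3 \\ 0 \\ -1 \\ -1 \\ 1 \end{pmatrix} + \begin{pmatrix} 3 \\ 0 \\ -1 \\ 1 \\ -1 \end{pmatrix}, \qquad (5.65) $$

or

$$ \chi^{4}_{SO(3)} = \chi^{A_1} + \chi^{E} + \chi^{F_1} + \chi^{F_2}. \qquad (5.66) $$

The octahedral crystal field splits the nine states into four multiplets with
symmetries $A_1$, $E$, $F_1$, $F_2$ and degeneracies 1, 2, 3 and 3, respectively.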
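The branching rule can be checked mechanically with the character orthogonality theorem. The following is a minimal Python sketch, not part of the original notes: the names `CLASSES`, `IRREPS`, `chi_so3` and `branching` are illustrative choices, and the SO(3) character is evaluated as a sum of cosines (equivalent to the ratio of sines above) so that the $\theta = 0$ class needs no special handling.

```python
import math

# Conjugacy classes of O as (label, size, rotation angle), in the column order of table 5.5.
CLASSES = [("e", 1, 0.0), ("C3", 8, 2 * math.pi / 3), ("C4^2", 3, math.pi),
           ("C2", 6, math.pi), ("C4", 6, math.pi / 2)]

# Rows of the character table of O.
IRREPS = {
    "A1": [1, 1, 1, 1, 1],
    "A2": [1, 1, 1, -1, -1],
    "E": [2, -1, 2, 0, 0],
    "F2": [3, 0, -1, 1, -1],
    "F1": [3, 0, -1, -1, 1],
}

def chi_so3(l, theta):
    """Character of the spin-l representation: sum over m of exp(i m theta)."""
    return sum(math.cos(m * theta) for m in range(-l, l + 1))

def branching(l):
    """Multiplicity of each irrep of O in the spin-l representation of SO(3)."""
    order = sum(size for _, size, _ in CLASSES)  # |O| = 24
    result = {}
    for name, chars in IRREPS.items():
        total = sum(size * chi * chi_so3(l, theta)
                    for (_, size, theta), chi in zip(CLASSES, chars))
        result[name] = round(total / order)
    return result

print(branching(4))
```

Calling `branching(l)` for other values of `l` gives the corresponding splittings; all characters here are real, so no complex conjugation is needed in the inner product.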

We have considered only the simplest case here, ignoring the complica- 
tions introduced by reflection symmetries, and by 2-valued spinor represen- 
tations of the rotation group. 



5.4 Further Exercises and Problems 

We begin with some technologically important applications of group theory 
to cryptography and number theory. 

Exercise 5.18: The set $\mathbb{Z}_n$ forms a group under multiplication only
when $n$ is a prime number. Show, however, that the subset
$U(\mathbb{Z}_n) \subset \mathbb{Z}_n$ of elements of $\mathbb{Z}_n$ that are
co-prime to $n$ is a group. It is the group of units of the ring $\mathbb{Z}_n$.

Exercise 5.19: Cyclic groups. A group $G$ is said to be cyclic if its elements
consist of powers $a^n$ of an element $a$, called the generator. The group will
be of finite order $|G| = m$ if $a^m = a^0 = e$ for some $m \in \mathbb{Z}^+$.

a) Show that a group of prime order is necessarily cyclic, and that any
element other than the identity can serve as its generator. (Hint: Let
$a$ be any element other than $e$ and consider the subgroup consisting of
powers $a^m$.)

b) Show that any subgroup of a cyclic group is itself cyclic.

Exercise 5.20: Cyclic groups and cryptography. In a large cyclic group $G$
it can be relatively easy to compute $a^x$, but to recover $x$ given $h = a^x$
one might have to compute $a^y$ and compare it with $h$ for every
$1 \le y \le |G|$. If $|G|$ has several hundred digits, such a brute force
search could take longer than the age of the universe. Rather more efficient
algorithms for this discrete logarithm problem exist, but the difficulty is
still sufficient for it to be useful in cryptography.

a) Diffie-Hellman key exchange. This algorithm allows Alice and Bob to
establish a secret key that can be used with a conventional cypher without
Eve, who is listening to their conversation, being able to reconstruct
it. Alice chooses a random element $g \in G$ and an integer $x$ between 1 and
$|G|$ and computes $g^x$. She sends $g$ and $g^x$ to Bob, but keeps $x$ to
herself. Bob chooses an integer $y$ and computes $g^y$ and $g^{xy} = (g^x)^y$.
He keeps $y$ secret and sends $g^y$ to Alice, who computes $g^{xy} = (g^y)^x$.
Show that, although Eve knows $g$, $g^y$ and $g^x$, she cannot obtain Alice and
Bob's secret key $g^{xy}$ without solving the discrete logarithm problem.

b) ElGamal public key encryption. This algorithm, based on Diffie-Hellman,
was invented by the Egyptian cryptographer Taher Elgamal. It is a
component of PGP and other modern encryption packages. To use
it, Alice first chooses a random integer $x$ in the range 1 to $|G|$ and
computes $h = a^x$. She publishes a description of $G$, together with the
elements $h$ and $a$, as her public key. She keeps the integer $x$ secret. To
send a message $m$ to Alice, Bob chooses an integer $y$ in the same range
and computes $c_1 = a^y$, $c_2 = mh^y$. He transmits $c_1$ and $c_2$ to Alice,
but keeps $y$ secret. Alice can recover $m$ from $c_1$, $c_2$ by computing
$c_2(c_1^x)^{-1}$. Show that, although Eve knows Alice's public key and has
overheard $c_1$ and $c_2$, she nonetheless cannot decrypt the message without
solving the discrete logarithm problem.

Popular choices for $G$ are subgroups of $(\mathbb{Z}_p)^\times$, for large
prime $p$. $(\mathbb{Z}_p)^\times$ is itself cyclic (can you prove this?), but
is unsuitable for technical reasons.
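The mechanics of both protocols in Exercise 5.20 can be tried out with modular exponentiation. The sketch below is illustrative only: the modulus `P` (the Mersenne prime $2^{127}-1$) and the base `BASE = 3` are assumptions chosen for convenience, the base is not checked to generate the whole group, and the parameters are far too small for real cryptographic use. It does not address the exercise's security question; it only confirms that Alice and Bob end up with matching values.

```python
import secrets

P = 2**127 - 1   # toy modulus (a Mersenne prime); an assumption, not a recommendation
BASE = 3         # assumed base element a (or g); not verified to be a generator

def random_exponent():
    """Random integer in the range 1 .. P-2."""
    return secrets.randbelow(P - 2) + 1

def dh_exchange():
    """Return the keys computed independently by Alice and by Bob."""
    x = random_exponent()                        # Alice's secret
    y = random_exponent()                        # Bob's secret
    gx, gy = pow(BASE, x, P), pow(BASE, y, P)    # values sent in the clear
    return pow(gy, x, P), pow(gx, y, P)          # (g^y)^x and (g^x)^y

def elgamal_keygen():
    """Return (secret x, public h = a^x)."""
    x = random_exponent()
    return x, pow(BASE, x, P)

def elgamal_encrypt(m, h):
    """Return (c1, c2) = (a^y, m h^y) for a fresh random y."""
    y = random_exponent()
    return pow(BASE, y, P), (m * pow(h, y, P)) % P

def elgamal_decrypt(c1, c2, x):
    """Recover m = c2 (c1^x)^(-1)."""
    shared = pow(c1, x, P)                       # c1^x = a^(xy) = h^y
    return (c2 * pow(shared, -1, P)) % P         # modular inverse needs Python 3.8+

alice_key, bob_key = dh_exchange()
secret_x, public_h = elgamal_keygen()
cipher = elgamal_encrypt(123456789, public_h)
print(alice_key == bob_key, elgamal_decrypt(*cipher, secret_x))
```

The message `m` must be a non-zero residue mod `P` for decryption to return it unchanged.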

Exercise 5.21: Modular arithmetic and number theory. An integer $a$ is said
to be a quadratic residue mod $p$ if there is an $r$ such that
$a \equiv r^2 \pmod p$. Let $p$ be an odd prime. Show that if
$r_1^2 \equiv r_2^2 \pmod p$ then $r_1 \equiv \pm r_2 \pmod p$, and that
$r \not\equiv -r \pmod p$. Deduce that exactly one half of the $p - 1$ non-zero
elements of $\mathbb{Z}_p$ are quadratic residues.

Now consider the Legendre symbol

$$ \left(\frac{a}{p}\right) = \begin{cases} 0, & a = 0, \\ 1, & a \text{ a quadratic residue (mod } p), \\ -1, & a \text{ not a quadratic residue (mod } p). \end{cases} $$

Show that

$$ \left(\frac{a}{p}\right)\left(\frac{b}{p}\right) = \left(\frac{ab}{p}\right), $$

and so the Legendre symbol forms a one-dimensional representation of the
multiplicative group $(\mathbb{Z}_p)^\times$. Combine this fact with the
character orthogonality theorem to give an alternative proof that precisely
half the $p - 1$ elements of $(\mathbb{Z}_p)^\times$ are quadratic residues.
(Hint: To show that the product of two non-residues is a residue, observe that
the set of residues is a normal subgroup of $(\mathbb{Z}_p)^\times$, and
consider the multiplication table of the resulting quotient group.)

Exercise 5.22: More practice with modular arithmetic. Again let $p$ be an odd
prime. Prove Euler's theorem that

$$ a^{(p-1)/2} \ (\mathrm{mod}\ p) = \left(\frac{a}{p}\right). $$

(Hint: Begin by showing that the usual school-algebra proof that an equation
of degree $n$ can have no more than $n$ solutions remains valid for arithmetic
modulo a prime number, and so $a^{(p-1)/2} \equiv 1 \pmod p$ can have no more
than $(p-1)/2$ roots. Cite Fermat's little theorem to show that these roots
must be the quadratic residues. Cite Fermat again to show that the quadratic
non-residues must then have $a^{(p-1)/2} \equiv -1 \pmod p$.)

The harder-to-prove law of quadratic reciprocity asserts that for $p$, $q$ odd
primes, we have

$$ (-1)^{(p-1)(q-1)/4} \left(\frac{p}{q}\right) = \left(\frac{q}{p}\right). $$
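The statements in Exercises 5.21 and 5.22 are easy to test numerically for small primes before attempting the proofs. The following brute-force sketch is not from the original notes; the helper names `legendre`, `euler`, `check` and `reciprocity` are hypothetical, and the approach (enumerating all squares) is only sensible for small $p$.

```python
def legendre(a, p):
    """Legendre symbol (a/p) for an odd prime p, found by listing all squares mod p."""
    a %= p
    if a == 0:
        return 0
    residues = {(r * r) % p for r in range(1, p)}
    return 1 if a in residues else -1

def euler(a, p):
    """a^((p-1)/2) mod p, mapped to the representative in {-1, 0, 1}."""
    v = pow(a, (p - 1) // 2, p)
    return v - p if v == p - 1 else v

def check(p):
    """Return three booleans: half are residues, multiplicativity, Euler's theorem."""
    nonzero = range(1, p)
    half = sum(1 for a in nonzero if legendre(a, p) == 1) == (p - 1) // 2
    mult = all(legendre(a * b, p) == legendre(a, p) * legendre(b, p)
               for a in nonzero for b in nonzero)
    eul = all(euler(a, p) == legendre(a, p) for a in range(p))
    return half, mult, eul

def reciprocity(p, q):
    """Check the quadratic reciprocity relation for distinct odd primes p and q."""
    sign = (-1) ** (((p - 1) * (q - 1)) // 4)
    return sign * legendre(p, q) == legendre(q, p)

print(check(23), reciprocity(7, 11))
```

A check of this kind is evidence, not a proof; the exercises still ask for the group-theoretic argument.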

Problem 5.23: Buckyball spectrum. Consider the symmetry group of the $C_{60}$
buckyball molecule of figure 5.3.

a) Starting from the character table of the orientation-preserving icosahedral
group $Y$ (table 5.3), and using the fact that the $\mathbb{Z}_2$ parity
inversion $\sigma : \mathbf{r} \to -\mathbf{r}$ combines with $g \in Y$ so that
$D^{J_g}(\sigma g) = D^{J_g}(g)$, whilst $D^{J_u}(\sigma g) = -D^{J_u}(g)$,
write down the character table of the extended group
$Y_h = Y \times \mathbb{Z}_2$ that acts as a symmetry on the $C_{60}$ molecule.
There are now ten conjugacy classes, and the ten representations will be
labelled $A_g$, $A_u$, etc. Verify that your character table has the expected
row-orthogonality properties.

b) By counting the number of atoms left fixed by each group operation,
compute the compound character of the action of $Y_h$ on the $C_{60}$ molecule.
(Hint: Examine the pattern of panels on a regulation soccer ball, and
deduce that four carbon atoms are left unmoved by operations in the
class $\sigma C_2$.)

c) Use your compound character from part b) to show that the 60-dimensional
Hilbert space decomposes as

$$ \mathcal{H}_{C_{60}} = A_g \oplus T_{1g} \oplus 2T_{1u} \oplus T_{2g} \oplus 2T_{2u} \oplus 2G_g \oplus 2G_u \oplus 3H_g \oplus 2H_u, $$

consistent with the energy-levels sketched in figure 5.3.
Problem 5.24: The Frobenius-Schur Indicator. Recall that a real or pseudo-real
representation is one such that $D(g) \sim D^*(g)$, and for unitary matrices $D$
we have $D^*(g) = [D^T(g)]^{-1}$. In this unitary case $D(g)$ being real or
pseudo-real is equivalent to the statement that there exists an invertible
matrix $F$ such that

$$ F D(g) F^{-1} = [D^T(g)]^{-1}. $$

We can rewrite this statement as $D^T(g) F D(g) = F$, and so $F$ can be
interpreted as the matrix representing a $G$-invariant quadratic form.

i) Use Schur's lemma to show that when $D$ is irreducible the matrix $F$ is
unique up to an overall constant. In other words, $D^T(g) F_1 D(g) = F_1$
and $D^T(g) F_2 D(g) = F_2$ for all $g \in G$ implies that $F_2 = \lambda F_1$.
Deduce that for irreducible $D$ we have $F^T = \pm F$.

ii) By reducing $F$ to a suitable canonical form, show that $F$ is symmetric
($F = F^T$) in the case that $D(g)$ is a real representation, and $F$ is skew
symmetric ($F = -F^T$) when $D(g)$ is a pseudo-real representation.

iii) Now let $G$ be a finite group. For any matrix $U$, the sum

$$ F_U = \frac{1}{|G|} \sum_{g \in G} D^T(g)\, U\, D(g) $$

is a $G$-invariant matrix. Deduce that $F_U$ is always zero when $D(g)$ is
neither real nor pseudo-real, and, by specializing both $U$ and the indices
on $F_U$, show that in the real or pseudo-real case

$$ \sum_{g \in G} \chi(g^2) = \pm \sum_{g \in G} \chi(g)\chi(g), $$

where $\chi(g) = \mathrm{tr}\, D(g)$ is the character of the irreducible
representation $D(g)$. Deduce that the Frobenius-Schur indicator

$$ \frac{1}{|G|} \sum_{g \in G} \chi(g^2) $$

takes the value $+1$, $-1$, or $0$ when $D(g)$ is, respectively, real,
pseudo-real, or not real.

iv) Show that the identity representation occurs in the decomposition of the
tensor product $D(g) \otimes D(g)$ of an irrep with itself if, and only if,
$D(g)$ is real or pseudo-real. Given a basis for the vector space $V$ on which
$D(g)$ acts, show the matrix $F$ can be used to construct the basis for the
identity-representation subspace $V^{\rm id}$ in the decomposition

$$ V \otimes V = \bigoplus_{\text{irreps } J} V^{J}. $$
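To get a feel for part iii) of Problem 5.24, the indicator can be evaluated directly for a few small matrix groups. The sketch below is an illustration under stated assumptions, not the notes' own method: the three example groups (the 2-dimensional irrep of the square's symmetry group $D_4$, the quaternion group generated by $i\sigma_1$ and $i\sigma_2$, and a faithful 1-dimensional irrep of $\mathbb{Z}_3$) and all function names are choices made here. Each group is built by closing its generators under multiplication, with rounded entries used only as dictionary keys.

```python
import cmath

def matmul(a, b):
    n = len(a)
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n))
                 for i in range(n))

def key(m):
    """Rounded copy of a matrix, used only to recognise elements already found."""
    return tuple(tuple((round(z.real, 6), round(z.imag, 6)) for z in row) for row in m)

def generate(gens):
    """Close a list of matrices under multiplication (a finite group is assumed)."""
    n = len(gens[0])
    identity = tuple(tuple(1 if i == j else 0 for j in range(n)) for i in range(n))
    group = {key(identity): identity}
    frontier = [identity]
    while frontier:
        g = frontier.pop()
        for s in gens:
            h = matmul(g, s)
            if key(h) not in group:
                group[key(h)] = h
                frontier.append(h)
    return list(group.values())

def indicator(gens):
    """(1/|G|) times the sum over g of tr D(g^2), rounded to the nearest integer."""
    group = generate(gens)
    total = sum(sum(matmul(g, g)[i][i] for i in range(len(g))) for g in group)
    return round((total / len(group)).real)

D4_GENS = [((0, -1), (1, 0)), ((1, 0), (0, -1))]      # rotation by 90 degrees, reflection
Q8_GENS = [((0, 1j), (1j, 0)), ((0, 1), (-1, 0))]      # i*sigma_1 and i*sigma_2
Z3_GENS = [((cmath.exp(2j * cmath.pi / 3),),)]         # a primitive cube root of unity

print(indicator(D4_GENS), indicator(Q8_GENS), indicator(Z3_GENS))
```

The three values correspond to a real, a pseudo-real, and a genuinely complex irreducible representation respectively.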
Problem 5.25: Induced Representations. Suppose we know a representation
$D^W(h) : W \to W$ for a subgroup $H \subset G$. From this representation we can
construct an induced representation $\mathrm{Ind}_H^G(D^W)$ for the larger
group $G$. The construction cleverly combines the coset space $G/H$ with the
representation space $W$ to make a (usually reducible) representation space
$\mathrm{Ind}_H^G(W)$ of dimension $|G/H| \times \dim W$.

Recall that there is a natural action of $G$ on the coset space $G/H$. If
$x = \{g_1, g_2, \ldots\} \in G/H$ then $gx$ is the coset
$\{gg_1, gg_2, \ldots\}$. We select from each coset $x \in G/H$ a representative
element $a_x$, and observe that the product $ga_x$ can be decomposed as
$ga_x = a_{gx}h$, where $a_{gx}$ is the selected representative from the coset
$gx$ and $h$ is some element of $H$. Next we introduce a basis $|n,x\rangle$
for $\mathrm{Ind}_H^G(W)$. We use the symbol "0" to label the coset $\{e\}$, and
take $|n,0\rangle$ to be the basis vectors for $W$. For $h \in H$ we can
therefore set

$$ D(h)|n,0\rangle = |m,0\rangle D^W_{mn}(h). $$

We also define the result of the action of $a_x$ on $|n,0\rangle$ to be the
vector $|n,x\rangle$:

$$ D(a_x)|n,0\rangle \stackrel{\rm def}{=} |n,x\rangle. $$

We may now obtain the action of a general element of $G$ on the vectors
$|n,x\rangle$ by requiring $D(g)$ to be a representation, and so computing

$$ \begin{aligned} D(g)|n,x\rangle &= D(g)D(a_x)|n,0\rangle \\ &= D(ga_x)|n,0\rangle \\ &= D(a_{gx}h)|n,0\rangle \\ &= D(a_{gx})D(h)|n,0\rangle \\ &= D(a_{gx})|m,0\rangle D^W_{mn}(h) \\ &= |m,gx\rangle D^W_{mn}(h). \end{aligned} $$

i) Confirm that the action $D(g)|n,x\rangle = |m,gx\rangle D^W_{mn}(h)$, with
$h$ obtained from $g$ and $x$ via the decomposition $ga_x = a_{gx}h$, does
indeed define a representation of $G$. Show also that if we set
$|f\rangle = \sum_{n,x} f_n(x)|n,x\rangle$, then the action of $g$ on the
components takes

$$ f_n(x) \mapsto D^W_{nm}(h) f_m(g^{-1}x). $$

ii) Let $f(h)$ be a class function on $H$. Let us extend it to a function on
$G$ by setting $f(g) = 0$ if $g \notin H$, and define

$$ \mathrm{Ind}_H^G[f](s) = \sum_{x \in G/H} f(a_x^{-1} s\, a_x). $$

Show that $\mathrm{Ind}_H^G[f](s)$ is a class function on $G$, and further show
that if $\chi_W$ is the character of the starting representation for $H$ then
$\mathrm{Ind}_H^G[\chi_W]$ is the character of the induced representation of
$G$. (Hint: only fixed points of the $G$-action on $G/H$ contribute to the
character, and $gx = x$ means that $ga_x = a_xh$. Thus
$D^W(h) = D^W(a_x^{-1} g a_x)$.)

iii) Given a representation $D^V(g) : V \to V$ of $G$ we can trivially obtain a
(generally reducible) representation $\mathrm{Res}_H^G(V)$ of $H \subset G$ by
restricting $G$ to $H$. Define the usual inner product on the group functions by

$$ \langle \phi_1, \phi_2 \rangle_G = \frac{1}{|G|} \sum_{g \in G} \phi_1^*(g)\phi_2(g), $$

and show that if $\psi$ is a class function on $H$ and $\phi$ a class function
on $G$ then

$$ \langle \psi, \mathrm{Res}_H^G[\phi] \rangle_H = \langle \mathrm{Ind}_H^G[\psi], \phi \rangle_G. $$

Thus, $\mathrm{Ind}_H^G$ and $\mathrm{Res}_H^G$ are, in some sense, adjoint
operations. Mathematicians would call them a pair of mutually adjoint functors.

iv) By applying the result from part (iii) to the characters of the irreducible
representations of $G$ and $H$, deduce Frobenius' reciprocity theorem: The
number of times an irrep $D^J(g)$ of $G$ occurs in the representation induced
from an irrep $D^K(h)$ of $H$ is equal to the number of times that $D^K$ occurs
in the decomposition of $D^J$ into irreps of $H$.

The representations of the Poincaré group (= the SO(1,3) Lorentz group
together with space-time translations) that classify the states of a spin-$J$
elementary particle are those induced from the spin-$J$ representation of its
SO(3) rotation subgroup. The quantum state of a mass $m$ elementary particle is
therefore of the form $|k,\sigma\rangle$ where $k$ is the particle's
four-momentum, which lies in the coset SO(1,3)/SO(3), and $\sigma$ is the label
from the $|J,\sigma\rangle$ spin state.
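Parts (ii) to (iv) of Problem 5.25 can be explored on the smallest non-trivial example, $H = \mathbb{Z}_3 \subset G = S_3$. The sketch below is a hypothetical illustration: it assumes the induced class function is computed as a sum of $f(a_x^{-1} s\, a_x)$ over left-coset representatives with $f$ set to zero off $H$, stores permutations as tuples, and uses names (`PSI`, `induced`, `inner`, `IRREPS_G`) invented here. It compares the two sides of the reciprocity relation for each irreducible character of $S_3$.

```python
import cmath
from itertools import permutations

def compose(p, q):
    """(p*q)(i) = p(q(i)) for permutations stored as tuples."""
    return tuple(p[i] for i in q)

def inverse(p):
    inv = [0] * len(p)
    for i, pi in enumerate(p):
        inv[pi] = i
    return tuple(inv)

def sign(p):
    inversions = sum(1 for i in range(len(p)) for j in range(i + 1, len(p)) if p[i] > p[j])
    return -1 if inversions % 2 else 1

G = list(permutations(range(3)))                     # the group S3
E, C = (0, 1, 2), (1, 2, 0)                          # identity and a 3-cycle
H = [E, C, compose(C, C)]                            # the Z3 subgroup
OMEGA = cmath.exp(2j * cmath.pi / 3)
PSI = {E: 1, C: OMEGA, compose(C, C): OMEGA ** 2}    # a one-dimensional irrep of H

def coset_reps():
    """One representative a_x from each left coset gH."""
    reps, seen = [], set()
    for g in G:
        if g not in seen:
            reps.append(g)
            seen.update(compose(g, h) for h in H)
    return reps

def induced(f):
    """Sum of f(a^-1 s a) over coset representatives a, with f = 0 outside H."""
    reps = coset_reps()
    return {s: sum(f.get(compose(inverse(a), compose(s, a)), 0) for a in reps) for s in G}

def inner(f1, f2, group):
    """(1/|group|) times the sum of conj(f1) * f2 over the given group."""
    return sum(f1[g].conjugate() * f2[g] for g in group) / len(group)

IRREPS_G = {
    "trivial": {g: 1 for g in G},
    "sign": {g: sign(g) for g in G},
    "standard": {g: sum(1 for i in range(3) if g[i] == i) - 1 for g in G},
}

IND_PSI = induced(PSI)
for name, phi in IRREPS_G.items():
    lhs = inner(PSI, phi, H)        # psi paired with phi restricted to H
    rhs = inner(IND_PSI, phi, G)    # induced psi paired with phi on G
    print(name, round(lhs.real, 9), round(rhs.real, 9))
```

In this example the induced class function coincides with the character of the two-dimensional irrep of $S_3$, which is why only the "standard" row gives a non-zero multiplicity.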



Chapter 6 
Lie Groups 



Lie groups are named after the Norwegian mathematician Sophus Lie. They
consist of a manifold $G$ equipped with a group multiplication rule
$(g_1, g_2) \mapsto g_3$ which is a smooth function of the $g$'s, as is the
operation of taking the inverse of a group element. The most commonly met
examples in physics are the infinite families of matrix groups GL(n), SL(n),
O(n), SO(n), U(n), SU(n), and Sp(n), together with the family of five
exceptional Lie groups: $G_2$, $F_4$, $E_6$, $E_7$, and $E_8$, which have
applications in string theory.

One of the properties of a Lie group is that, considered as a manifold, 
the neighbourhood of any point looks exactly like that of any other. The 
group's dimension and most of its structure can be understood by examining 
the immediate vicinity of any chosen point, which we may as well take to be
the identity element. The vectors lying in the tangent space at the identity 
element make up the Lie algebra of the group. Computations in the Lie 
algebra are often easier than those in the group, and provide much of the 
same information. This chapter will be devoted to studying the interplay 
between the Lie group itself and this Lie algebra of infinitesimal elements. 

6.1 Matrix Groups 

The Classical Groups are described in a book with this title by Hermann 
Weyl. They are subgroups of the general linear group, GL(n, F), which con- 
sists of invertible n-by-n matrices over the field F. We will mostly consider 
the cases F = C or F = R. 

A near-identity matrix in GL(n, R) can be written $g = I + \epsilon A$ where $A$
is an arbitrary n-by-n real matrix. This matrix contains $n^2$ real entries, so
we can move away from the identity in $n^2$ distinct directions. The tangent
space at the identity, and hence the group manifold itself, is therefore $n^2$
dimensional. The manifold of GL(n, C) has $n^2$ complex dimensions, and this
corresponds to $2n^2$ real dimensions.

If we restrict the determinant of a GL(n, F) matrix to be unity, we get
the special linear group, SL(n, F). An element near the identity in this group
can still be written as $g = I + \epsilon A$, but since

$$ \det(I + \epsilon A) = 1 + \epsilon\, \mathrm{tr}(A) + O(\epsilon^2) \qquad (6.1) $$

this requires $\mathrm{tr}(A) = 0$. The restriction on the trace means that
SL(n, R) has dimension $n^2 - 1$.
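The first-order expansion of the determinant is easy to confirm numerically. The following is a small sketch with arbitrarily chosen 3-by-3 matrices (the names `TRACELESS`, `GENERIC` and `slope` are illustrative): it estimates the coefficient of $\epsilon$ in $\det(I + \epsilon A)$ by a finite difference and compares it with the trace.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def near_identity(a_matrix, eps):
    """The matrix I + eps * A."""
    return [[(1.0 if r == c else 0.0) + eps * a_matrix[r][c] for c in range(3)]
            for r in range(3)]

def slope(a_matrix, eps=1e-6):
    """(det(I + eps*A) - 1) / eps, which should approach tr(A) as eps -> 0."""
    return (det3(near_identity(a_matrix, eps)) - 1.0) / eps

TRACELESS = [[0.25, -1.5, 0.5], [2.0, 0.75, -0.5], [-0.75, 1.25, -1.0]]  # trace 0
GENERIC = [[0.25, -1.5, 0.5], [2.0, 0.75, -0.5], [-0.75, 1.25, 0.5]]     # trace 1.5

print(slope(TRACELESS), slope(GENERIC))
```

For the traceless matrix the determinant differs from 1 only at order $\epsilon^2$, which is the statement that $I + \epsilon A$ stays in SL(3, R) to first order.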

6.1.1 The Unitary and Orthogonal Groups 

Perhaps the most important of the matrix groups are the unitary and or- 
thogonal groups. 



The Unitary group 

The unitary group U(n) comprises the set of n-by-n complex matrices $U$ such
that $U^\dagger = U^{-1}$. If we consider matrices near the identity

$$ U = I + \epsilon A, \qquad (6.2) $$

with $\epsilon$ real, then unitarity requires

$$ I + O(\epsilon^2) = (I + \epsilon A)(I + \epsilon A^\dagger) = I + \epsilon(A + A^\dagger) + O(\epsilon^2), \qquad (6.3) $$

so $A_{ij} = -A^*_{ji}$ and $A$ is skew hermitian. A complex skew-hermitian
matrix contains

$$ n + 2 \times \frac{1}{2} n(n-1) = n^2 $$

real parameters. In this counting the first "n" is the number of entries on
the diagonal, each of which must be of the form $i$ times a real number. The
$n(n-1)/2$ is the number of entries above the main diagonal, each of which
can be an arbitrary complex number. The number of real dimensions in the
group manifold is therefore $n^2$. The rows or columns in the matrix $U$ form
an orthonormal set of vectors. Their entries are therefore bounded,
$|U_{ij}| \le 1$, and this property leads to the $n^2$ dimensional group
manifold of U(n) being a compact set.

When a group manifold is compact, we say that the group itself is a 
compact group. There is a natural notion of volume on a group manifold 
and compact Lie groups have finite total volume. Because of this, they have 
many properties in common with the finite groups we studied in the last 
chapter. 

Recall that a group is simple if it possesses no invariant subgroups. U(n)
is not simple. Its centre is an invariant U(1) subgroup consisting of matrices
of the form $U = e^{i\theta} I$. The special unitary group SU(n) consists of
n-by-n unimodular (having determinant +1) unitary matrices. It is not strictly
simple because its centre $Z$ consists of the discrete subgroup of matrices
$U_m = \omega^m I$ with $\omega$ an n-th root of unity, and this is an invariant
subgroup. Because $Z$, its only invariant subgroup, is not a continuous group,
SU(n) is counted as being simple in Lie theory. With $U = I + \epsilon A$, as
above, the unimodularity imposes the additional constraint on $A$ that
$\mathrm{tr}\, A = 0$, so the SU(n) group manifold is $n^2 - 1$ dimensional.

The Orthogonal Group 

The orthogonal group O(n) consists of the set of real matrices $O$ with
the property that $O^T = O^{-1}$. For a matrix in the neighbourhood of the
identity, $O = I + \epsilon A$, this condition requires that $A$ be skew
symmetric: $A_{ji} = -A_{ij}$. Skew symmetric real matrices have $n(n-1)/2$
independent entries, and so the group manifold of O(n) is $n(n-1)/2$
dimensional. The condition $O^T O = I$ means that the rows or columns of $O$,
considered as row or column vectors, are orthonormal. All entries are bounded,
$|O_{ij}| \le 1$, and again this leads to O(n) being a compact group.

The identity

$$ 1 = \det(O^T O) = \det O^T \det O = (\det O)^2 \qquad (6.4) $$

tells us that $\det O = \pm 1$. The subset of orthogonal matrices with
$\det O = +1$ constitute a subgroup of O(n) called the special orthogonal
group, SO(n). The unimodularity condition discards a disconnected part of the
group manifold and does not reduce its dimension, which remains $n(n-1)/2$.
6.1.2 Symplectic Groups

The symplectic groups (named from Greek meaning to "fold together") are
probably less familiar than the other matrix groups.

We start with a non-degenerate skew-symmetric matrix $\omega$. The symplectic
group Sp(2n, F) is then defined by

$$ \mathrm{Sp}(2n, F) = \{ S \in \mathrm{GL}(2n, F) : S^T \omega S = \omega \}. \qquad (6.5) $$

Here F can be R or C. When F = C, we still use the transpose "T", not "†", in
this definition. Setting $S = I_{2n} + \epsilon A$ and demanding that
$S^T \omega S = \omega$ shows that $A^T \omega + \omega A = 0$.

It does not matter what skew matrix $\omega$ we start from, because we can
always find a basis in which $\omega$ takes its canonical form:

$$ \omega = \begin{pmatrix} 0 & -I_n \\ I_n & 0 \end{pmatrix}. \qquad (6.6) $$

In this basis we find, after a short computation, that the most general form
for $A$ is

$$ A = \begin{pmatrix} a & b \\ c & -a^T \end{pmatrix}. \qquad (6.7) $$

Here $a$ is any n-by-n matrix, and $b$ and $c$ are symmetric ($b^T = b$ and
$c^T = c$) n-by-n matrices. If the matrices are real, then counting the degrees
of freedom gives the dimension of the real symplectic group as

$$ \dim \mathrm{Sp}(2n, R) = n^2 + 2 \times \frac{n}{2}(n+1) = n(2n+1). \qquad (6.8) $$

The entries in $a$, $b$, $c$ can be arbitrarily large. Sp(2n, R) is not compact.
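The "short computation" leading to the block form of $A$ can be spot-checked with integer matrices. This sketch assumes the canonical $\omega$ with $-I_n$ in the upper-right block (the condition is insensitive to an overall sign of $\omega$) and uses arbitrary illustrative blocks for $n = 2$; the helper names are hypothetical.

```python
def matmul(x, y):
    return [[sum(x[i][k] * y[k][j] for k in range(len(y))) for j in range(len(y[0]))]
            for i in range(len(x))]

def transpose(x):
    return [list(row) for row in zip(*x)]

def negate(x):
    return [[-v for v in row] for row in x]

def block(tl, tr, bl, br):
    """Assemble a 2n-by-2n matrix from four n-by-n blocks."""
    return [r1 + r2 for r1, r2 in zip(tl, tr)] + [r1 + r2 for r1, r2 in zip(bl, br)]

N = 2
ZERO = [[0] * N for _ in range(N)]
EYE = [[int(i == j) for j in range(N)] for i in range(N)]
OMEGA = block(ZERO, negate(EYE), EYE, ZERO)

def generator(a, b, c):
    """The block matrix with rows (a, b) and (c, -a^T)."""
    return block(a, b, c, negate(transpose(a)))

def residual(gen):
    """A^T omega + omega A, which vanishes for a symplectic generator."""
    lhs, rhs = matmul(transpose(gen), OMEGA), matmul(OMEGA, gen)
    return [[lhs[i][j] + rhs[i][j] for j in range(2 * N)] for i in range(2 * N)]

def dim_sp(n):
    """n^2 entries in a, plus n(n+1)/2 each in the symmetric blocks b and c."""
    return n * n + 2 * (n * (n + 1) // 2)

a = [[1, 2], [3, 4]]          # arbitrary
b = [[5, 6], [6, 7]]          # symmetric
c = [[8, 9], [9, 10]]         # symmetric
b_bad = [[5, 6], [0, 7]]      # not symmetric

print(residual(generator(a, b, c)), dim_sp(N))
```

Replacing `b` by the non-symmetric `b_bad` gives a non-zero residual, showing that the symmetry of the off-diagonal blocks is needed.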

The determinant of any symplectic matrix is +1. To see this take the
elements of $\omega$ to be $\omega_{ij}$, and let

$$ \omega(x, y) = \omega_{ij} x^i y^j \qquad (6.9) $$

be the associated skew bilinear (not sesquilinear) form. Then Weyl's identity
from exercise ??.?? shows that

$$ \mathrm{Pf}(\omega)\,(\det M)\,\det|x_1, \ldots, x_{2n}| = \frac{1}{2^n n!} \sum_{\pi \in S_{2n}} \mathrm{sgn}(\pi)\, \omega(Mx_{\pi(1)}, Mx_{\pi(2)}) \cdots \omega(Mx_{\pi(2n-1)}, Mx_{\pi(2n)}), $$

for any linear map $M$. If $\omega(x, y) = \omega(Mx, My)$, we conclude that
$\det M = 1$; but preserving $\omega$ is exactly the condition that $M$ be an
element of the symplectic group. Since the matrices in Sp(2n, F) are
automatically unimodular there is no "special symplectic" group.

Unitary Symplectic Group 

The intersection of two groups is also a group. We therefore define the unitary
symplectic group as

$$ \mathrm{Sp}(n) = \mathrm{Sp}(2n, C) \cap \mathrm{U}(2n). \qquad (6.10) $$

This group is compact. We will see that its dimension is $n(2n+1)$, the same
as the non-compact Sp(2n, R). Sp(n) may also be defined as U(n, ℍ) where
ℍ denotes the skew field of quaternions.

Warning: Physics papers often make no distinction between Sp(n), which
is a compact group, and Sp(2n, R) which is non-compact. To add to the
confusion the compact Sp(n) is also sometimes called Sp(2n). You have to
judge from the context what group the author has in mind.

Physics Application: Kramers' degeneracy. Let $C = i\sigma_2$. Then

$$ C^{-1} \sigma_n C = -\sigma_n^*. \qquad (6.11) $$

A time-reversal invariant Hamiltonian containing $L \cdot S$ spin-orbit
interactions obeys

$$ C^{-1} H C = H^*. \qquad (6.12) $$

If we regard the 2n-by-2n matrix $H$ as being an n-by-n matrix whose entries
$H_{ij}$ are themselves 2-by-2 matrices, which we expand as

$$ H_{ij} = h^0_{ij} + i \sum_{n=1}^{3} h^n_{ij} \sigma_n, $$

then the condition (6.12) implies that the $h^a_{ij}$ are real numbers. We say
that $H$ is real quaternionic. This is because the Pauli sigma matrices are
algebraically isomorphic to Hamilton's quaternions under the identification

$$ -i\sigma_1 \leftrightarrow i, \qquad -i\sigma_2 \leftrightarrow j, \qquad -i\sigma_3 \leftrightarrow k. \qquad (6.13) $$

The hermiticity of $H$ requires that $H_{ji} = \overline{H}_{ij}$ where the
overbar denotes quaternionic conjugation

$$ q^0 + iq^1\sigma_1 + iq^2\sigma_2 + iq^3\sigma_3 \to q^0 - iq^1\sigma_1 - iq^2\sigma_2 - iq^3\sigma_3. \qquad (6.14) $$

If $H\psi = E\psi$, then $HC\psi^* = EC\psi^*$. Since $C$ is skew, $\psi$ and
$C\psi^*$ are necessarily orthogonal. Therefore all states are doubly
degenerate. This is Kramers' degeneracy.

$H$ may be diagonalized by a matrix in U(n, ℍ), where U(n, ℍ) consists
of those elements of U(2n) that satisfy $C^{-1} U C = U^*$. We may rewrite this
condition as

$$ C^{-1} U C = U^* \quad \Rightarrow \quad U C U^T = C, $$

so U(n, ℍ) consists of the unitary matrices that preserve the skew matrix $C$.
Thus U(n, ℍ) ⊆ Sp(n). Further investigation shows that U(n, ℍ) = Sp(n).

We can exploit the quaternionic viewpoint to count the dimensions. Let
$U = I + \epsilon B$ be in U(n, ℍ), then $B_{ij} + \overline{B}_{ji} = 0$. The
diagonal elements of $B$ are thus pure "imaginary" quaternions having no part
proportional to $I$. There are therefore 3 parameters for each diagonal
element. The upper triangle has $n(n-1)/2$ independent elements, each with 4
parameters. Counting up, we find

$$ \dim \mathrm{U}(n, ℍ) = \dim \mathrm{Sp}(n) = 3n + 4 \times \frac{n}{2}(n-1) = n(2n+1). \qquad (6.15) $$

Thus, as promised, we see that the compact group Sp(n) and the non-compact
group Sp(2n, R) have the same dimension.

We can also count the dimension of Sp(n) by looking at our previous
matrices

$$ A = \begin{pmatrix} a & b \\ c & -a^T \end{pmatrix} $$

where $a$, $b$ and $c$ are now allowed to be complex, but with the restriction
that $S = I + \epsilon A$ be unitary. This requires $A$ to be skew-hermitian,
so $a = -a^\dagger$, and $c = -b^\dagger$, while $b$ (and hence $c$) remains
symmetric. There are $n^2$ free real parameters in $a$, and $n(n+1)$ in $b$, so

$$ \dim \mathrm{Sp}(n) = (n^2) + n(n+1) = n(2n+1) $$

as before.
Exercise 6.1: Show that

$$ \mathrm{SO}(2N) \cap \mathrm{Sp}(2N, R) \cong \mathrm{U}(N). $$

Hint: Group the $2N$ basis vectors on which O(2N) acts into pairs $x_n$ and
$y_n$, $n = 1, \ldots, N$. Assemble these pairs into $z_n = x_n + iy_n$ and
$\bar{z}_n = x_n - iy_n$. Let $\omega$ be the linear map that takes
$x_n \to y_n$ and $y_n \to -x_n$. Show that the subset of SO(2N) that commutes
with $\omega$ mixes $z_i$'s only with $z_j$'s and $\bar{z}_i$'s only with
$\bar{z}_j$'s.



6.2 Geometry of SU(2) 



To get a sense of Lie groups as geometric objects, we will study the simplest
non-trivial case of SU(2) in some detail.

A general 2-by-2 complex matrix can be parametrized as

$$ U = \begin{pmatrix} x^0 + ix^3 & ix^1 + x^2 \\ ix^1 - x^2 & x^0 - ix^3 \end{pmatrix}. \qquad (6.16) $$

The determinant of this matrix is unity provided

$$ (x^0)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2 = 1. \qquad (6.17) $$

When this condition is met, and if in addition the $x^i$ are real, the matrix is
unitary: $U^\dagger = U^{-1}$. The group manifold of SU(2) can therefore be
identified with the three-sphere $S^3$. We will take as local co-ordinates
$x^1$, $x^2$, $x^3$. When we desire to know $x^0$ we will find it from
$x^0 = \sqrt{1 - (x^1)^2 - (x^2)^2 - (x^3)^2}$. This co-ordinate chart only
labels the points in the half of the three-sphere with $x^0 > 0$, but this is
typical of any non-trivial manifold. A complete atlas of charts can be
constructed if needed.

We can simplify our notation by using the Pauli sigma matrices

$$ \sigma_1 = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}, \quad \sigma_2 = \begin{pmatrix} 0 & -i \\ i & 0 \end{pmatrix}, \quad \sigma_3 = \begin{pmatrix} 1 & 0 \\ 0 & -1 \end{pmatrix}. \qquad (6.18) $$

These obey

$$ [\sigma_i, \sigma_j] = 2i\epsilon_{ijk}\sigma_k, \quad \text{and} \quad \sigma_i\sigma_j + \sigma_j\sigma_i = 2\delta_{ij} I. \qquad (6.19) $$

In terms of them, we can write

$$ g = U = x^0 I + ix^1\sigma_1 + ix^2\sigma_2 + ix^3\sigma_3. \qquad (6.20) $$
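The Pauli-matrix relations and the unit-determinant parametrization can be verified directly. The sketch below is an illustration with hypothetical helper names (`lin`, `eps`, `su2`, and so on); it stores 2-by-2 matrices as nested tuples, checks the commutation and anticommutation relations exactly, and confirms that a point on the three-sphere gives a unitary matrix of determinant one.

```python
import math

I2 = ((1, 0), (0, 1))
S1 = ((0, 1), (1, 0))
S2 = ((0, -1j), (1j, 0))
S3 = ((1, 0), (0, -1))
SIGMA = (S1, S2, S3)

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def lin(*terms):
    """Linear combination z1*m1 + z2*m2 + ... of 2x2 matrices."""
    return tuple(tuple(sum(z * m[i][j] for z, m in terms) for j in range(2))
                 for i in range(2))

def eps(i, j, k):
    """Levi-Civita symbol for indices in {0, 1, 2}."""
    return (i - j) * (j - k) * (k - i) // 2

def pauli_relations_hold():
    for i in range(3):
        for j in range(3):
            ab, ba = mul(SIGMA[i], SIGMA[j]), mul(SIGMA[j], SIGMA[i])
            want_comm = lin(*[(2j * eps(i, j, k), SIGMA[k]) for k in range(3)])
            want_anti = lin((2 if i == j else 0, I2))
            if lin((1, ab), (-1, ba)) != want_comm or lin((1, ab), (1, ba)) != want_anti:
                return False
    return True

def su2(x1, x2, x3):
    """x0 I + i x1 s1 + i x2 s2 + i x3 s3 with x0 fixed by the unit-sphere condition."""
    x0 = math.sqrt(1 - x1 * x1 - x2 * x2 - x3 * x3)
    return lin((x0, I2), (1j * x1, S1), (1j * x2, S2), (1j * x3, S3))

def det(u):
    return u[0][0] * u[1][1] - u[0][1] * u[1][0]

def dagger(u):
    return tuple(tuple(u[j][i].conjugate() for j in range(2)) for i in range(2))

U = su2(0.1, -0.4, 0.7)
print(pauli_relations_hold(), abs(det(U) - 1) < 1e-12)
```

The chart used here covers only the hemisphere $x^0 > 0$, matching the choice made in the text.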






Elements of the group in the neighbourhood of the identity differ from $e = I$
by real linear combinations of the $i\sigma_i$. The three-dimensional vector
space spanned by these matrices is therefore the tangent space $TG_e$ at the
identity element. For any Lie group this tangent space is called the Lie
algebra, $\mathfrak{g} = \mathrm{Lie}\,G$, of the group. There will be a
similar set of matrices $i\lambda_i$ for any matrix group. They are called the
generators of the Lie algebra, and satisfy commutation relations of the form

$$ [i\lambda_i, i\lambda_j] = -f_{ij}{}^{k}(i\lambda_k), \qquad (6.21) $$

or equivalently

$$ [\lambda_i, \lambda_j] = i f_{ij}{}^{k} \lambda_k. \qquad (6.22) $$

The $f_{ij}{}^{k}$ are called the structure constants of the algebra. The
"$i$"s associated with the $\lambda$'s in this expression are conventional in
physics texts because for quantum mechanics applications we usually desire the
$\lambda_i$ to be hermitian. They are usually absent in books written for
mathematicians.

Exercise 6.2: Let $\lambda_1$ and $\lambda_2$ be hermitian matrices. Show that
if we define $\lambda_3$ by the relation $[\lambda_1, \lambda_2] = i\lambda_3$,
then $\lambda_3$ is also a hermitian matrix.

Exercise 6.3: For the group O(n) the matrices "$i\lambda$" are real n-by-n skew
symmetric matrices $A$. Show that if $A_1$ and $A_2$ are real skew symmetric
matrices, then so is $[A_1, A_2]$.

Exercise 6.4: For the group Sp(2n, R) the $i\lambda$ matrices are of the form

$$ A = \begin{pmatrix} a & b \\ c & -a^T \end{pmatrix} $$

where $a$ is any real n-by-n matrix and $b$ and $c$ are symmetric ($b^T = b$
and $c^T = c$) real n-by-n matrices. Show that the commutator of any two
matrices of this form is also of this form.

6.2.1 Invariant vector fields 

Consider a matrix group, and in it a group element $I + i\epsilon\lambda_i$
lying close to the identity $e = I$. Draw an arrow connecting $I$ to
$I + i\epsilon\lambda_i$, and regard this arrow as a vector $L_i$ lying in
$TG_e$. Next map the infinitesimal element $I + i\epsilon\lambda_i$ to the
neighbourhood of an arbitrary group element $g$ by multiplying on the left to
get $g(I + i\epsilon\lambda_i)$. By drawing an arrow from $g$ to
$g(I + i\epsilon\lambda_i)$, we obtain a vector $L_i(g)$ lying in $TG_g$. This
vector at $g$ is the push forward of the vector at $e$ by left multiplication
by $g$. For example, consider SU(2) with infinitesimal element
$I + i\epsilon\sigma_3$. We find

$$ \begin{aligned} g(I + i\epsilon\sigma_3) &= (x^0 + ix^1\sigma_1 + ix^2\sigma_2 + ix^3\sigma_3)(I + i\epsilon\sigma_3) \\ &= (x^0 - \epsilon x^3) + i\sigma_1(x^1 - \epsilon x^2) + i\sigma_2(x^2 + \epsilon x^1) + i\sigma_3(x^3 + \epsilon x^0). \end{aligned} \qquad (6.23) $$

This computation can also be interpreted as showing that the multiplication
of $g \in$ SU(2) on the right by $(I + i\epsilon\sigma_3)$ displaces the point
$g$, changing its $x^i$ parameters by an amount

$$ \delta \begin{pmatrix} x^0 \\ x^1 \\ x^2 \\ x^3 \end{pmatrix} = \epsilon \begin{pmatrix} -x^3 \\ -x^2 \\ x^1 \\ x^0 \end{pmatrix}. \qquad (6.24) $$

Knowing how the displacement looks in terms of the $x^1$, $x^2$, $x^3$
co-ordinate system lets us read off the $\partial/\partial x^\mu$ components of
the vector $L_3$ lying in $TG_g$:

$$ L_3 = -x^2\partial_1 + x^1\partial_2 + x^0\partial_3. \qquad (6.25) $$

Since $g$ can be any point in the group, we have constructed a globally defined
vector field $L_3$ that acts on a function $F(g)$ on the group manifold as

$$ L_3 F(g) = \lim_{\epsilon \to 0} \left\{ \frac{1}{\epsilon} \left[ F\big(g(I + i\epsilon\sigma_3)\big) - F(g) \right] \right\}. \qquad (6.26) $$

Similarly we obtain

$$ \begin{aligned} L_1 &= x^0\partial_1 - x^3\partial_2 + x^2\partial_3, \\ L_2 &= x^3\partial_1 + x^0\partial_2 - x^1\partial_3. \end{aligned} \qquad (6.27) $$
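The components of the left-invariant fields can be read off numerically by carrying out the multiplication $g(I + i\epsilon\sigma_k)$ and extracting the shifted co-ordinates. The following sketch assumes the parametrization of (6.16) and (6.20); the function names and the sample point are illustrative choices.

```python
# i*sigma_k for k = 1, 2, 3, written out as 2x2 complex matrices.
I_SIGMA = {
    1: ((0, 1j), (1j, 0)),
    2: ((0, 1), (-1, 0)),
    3: ((1j, 0), (0, -1j)),
}

def mul(a, b):
    return tuple(tuple(sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2))
                 for i in range(2))

def to_matrix(x):
    """x0 I + i x1 s1 + i x2 s2 + i x3 s3 for x = (x0, x1, x2, x3)."""
    x0, x1, x2, x3 = x
    return ((x0 + 1j * x3, x2 + 1j * x1), (-x2 + 1j * x1, x0 - 1j * x3))

def to_coords(g):
    """Recover (x0, x1, x2, x3) from the top row of a matrix of the above form."""
    return (g[0][0].real, g[0][1].imag, g[0][1].real, g[0][0].imag)

def left_invariant_shift(x, k, eps=1e-3):
    """(co-ordinates of g(I + i eps sigma_k) minus those of g) divided by eps."""
    step = tuple(tuple((1 if r == c else 0) + eps * I_SIGMA[k][r][c] for c in range(2))
                 for r in range(2))
    moved = to_coords(mul(to_matrix(x), step))
    return tuple((m - xi) / eps for m, xi in zip(moved, x))

# Expected shifts (delta x0, delta x1, delta x2, delta x3) / eps for each field.
EXPECTED = {
    1: lambda x: (-x[1], x[0], -x[3], x[2]),   # L_1 = x0 d1 - x3 d2 + x2 d3
    2: lambda x: (-x[2], x[3], x[0], -x[1]),   # L_2 = x3 d1 + x0 d2 - x1 d3
    3: lambda x: (-x[3], -x[2], x[1], x[0]),   # L_3 = -x2 d1 + x1 d2 + x0 d3
}

POINT = (0.1, 0.7, -0.5, 0.5)   # a point on the unit three-sphere
for k in (1, 2, 3):
    print(k, left_invariant_shift(POINT, k))
```

Because the product is exactly linear in $\epsilon$, the finite difference reproduces the components up to rounding error; the $\delta x^0$ entry is the one implied by staying on the sphere.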



The vector fields $L_i$ are said to be left invariant because the push-forward
of the vector $L_i(g)$ lying in the tangent space at $g$ by multiplication on
the left by any $g'$ produces a vector $g'_*[L_i(g)]$ lying in the tangent
space at $g'g$, and this pushed-forward vector coincides with the $L_i(g'g)$
already there. We can express this statement tersely as $g_* L_i = L_i$.

Using $\partial_i x^0 = -x^i/x^0$, $i = 1, 2, 3$, we can compute the Lie
brackets and find

$$ [L_1, L_2] = -2L_3. \qquad (6.28) $$

In general

$$ [L_i, L_j] = -2\epsilon_{ijk} L_k, \qquad (6.29) $$

which coincides with the matrix commutator of the $i\sigma_i$.

This construction works for all Lie groups. For each basis vector $L_i$ in the
tangent space at the identity $e$, we push it forward to the tangent space at
$g$ by left multiplication by $g$, and so construct the global left-invariant
vector field $L_i$. The Lie bracket of these vector fields will be

$$ [L_i, L_j] = -f_{ij}{}^{k} L_k, \qquad (6.30) $$

where the coefficients $f_{ij}{}^{k}$ are guaranteed to be position independent
because (see exercise 3.5) the operation of taking the Lie bracket of two
vector fields commutes with the operation of pushing-forward the vector fields.
Consequently the Lie bracket at any point is just the image of the Lie bracket
calculated at the identity. When the group is a matrix group, this Lie bracket
will coincide with the commutator of the $i\lambda_i$, that group's analogue of
the $i\sigma_i$ matrices.



The Exponential Map 

Recall that given a vector field X = X M <9 M we define associated flow by 
solving the equation 

— =X"(x(t)). (6.31) 

If we do this for the left-invariant vector field $L$, with initial condition
$x(0) = e$, we obtain a $t$-dependent group element $g(x(t))$, which we denote
by $\mathrm{Exp}(tL)$. The symbol "Exp" stands for the exponential map, which takes
elements of the Lie algebra to elements of the Lie group. The reason for the
name and notation is that for matrix groups this operation corresponds to
the usual exponentiation of matrices. Elements of the matrix Lie group are
therefore exponentials of matrices in the Lie algebra. To see this, suppose
that $L_i$ is the left-invariant vector field derived from $i\hat\lambda_i$. Then the matrix

$$g(t) = \exp(it\hat\lambda_i) = I + it\hat\lambda_i - \frac{1}{2}t^2\hat\lambda_i^2 - i\frac{1}{3!}t^3\hat\lambda_i^3 + \cdots \tag{6.32}$$

is an element of the group, and

$$g(t+\epsilon) = \exp(it\hat\lambda_i)\exp(i\epsilon\hat\lambda_i) = g(t)\left(I + i\epsilon\hat\lambda_i + O(\epsilon^2)\right). \tag{6.33}$$

From this we deduce that

$$\frac{d}{dt}g(t) = \lim_{\epsilon\to 0}\left\{\frac{1}{\epsilon}\left[g(t)(I + i\epsilon\hat\lambda_i) - g(t)\right]\right\} = L_i\,g(t). \tag{6.34}$$

Since $\exp(it\hat\lambda_i) = I$ when $t = 0$, we deduce that $\mathrm{Exp}(tL_i) = \exp(it\hat\lambda_i)$.
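
The series (6.32) can be summed numerically. The sketch below does this for the SU(2) generator $i\hat\sigma_3$ and compares the result with the closed form $\cos t + i\hat\sigma_3 \sin t$. The truncation at 40 terms and the names `exp_series` and `g` are choices made for this illustration only.

```python
import numpy as np

sigma3 = np.array([[1, 0], [0, -1]], dtype=complex)

def exp_series(a, terms=40):
    """Truncated power series I + a + a^2/2! + ... for a square matrix a."""
    result = np.eye(a.shape[0], dtype=complex)
    term = np.eye(a.shape[0], dtype=complex)
    for n in range(1, terms):
        term = term @ a / n          # term now holds a^n / n!
        result = result + term
    return result

def g(t):
    """Group element exp(i t sigma_3), built from the series (6.32)."""
    return exp_series(1j * t * sigma3)

t, s = 0.7, 0.4
closed_form = np.cos(t) * np.eye(2) + 1j * np.sin(t) * sigma3
print(np.allclose(g(t), closed_form))                 # series vs. cos t + i sigma_3 sin t
print(np.allclose(g(t + s), g(t) @ g(s)))             # one-parameter subgroup property
print(np.allclose(g(t).conj().T @ g(t), np.eye(2)))   # g(t) is unitary
```

The second check is the finite-$\epsilon$ version of (6.33): the curve $g(t)$ is a one-parameter subgroup.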

Right-invariant vector fields

We can use multiplication on the right to push forward an infinitesimal group
element. For example:

$$\begin{aligned}
(I + i\epsilon\hat\sigma_3)g &= (I + i\epsilon\hat\sigma_3)(x^0 + ix^1\hat\sigma_1 + ix^2\hat\sigma_2 + ix^3\hat\sigma_3) \\
&= (x^0 - \epsilon x^3) + i\hat\sigma_1(x^1 + \epsilon x^2) + i\hat\sigma_2(x^2 - \epsilon x^1) + i\hat\sigma_3(x^3 + \epsilon x^0).
\end{aligned} \tag{6.35}$$

This motion corresponds to the right-invariant vector field

$$R_3 = x^2\partial_1 - x^1\partial_2 + x^0\partial_3. \tag{6.36}$$

Similarly, we obtain

$$\begin{aligned}
R_1 &= x^0\partial_1 + x^3\partial_2 - x^2\partial_3, \\
R_2 &= -x^3\partial_1 + x^0\partial_2 + x^1\partial_3,
\end{aligned} \tag{6.37}$$

and find that

$$[R_1, R_2] = +2R_3. \tag{6.38}$$

In general,

$$[R_i, R_j] = +2\epsilon_{ijk}R_k. \tag{6.39}$$

For any Lie group, the Lie brackets of the right-invariant fields will be

$$[R_i, R_j] = +f_{ij}{}^{k}R_k \tag{6.40}$$

whenever

$$[L_i, L_j] = -f_{ij}{}^{k}L_k \tag{6.41}$$

are the Lie brackets of the left-invariant fields. The relative minus sign
between the bracket algebra of the left and right invariant vector fields has
the same origin as the relative sign between the commutators of space- and
body-fixed rotations in classical mechanics. Because multiplication from the
left does not interfere with multiplication from the right, the left and right
invariant fields commute:

$$[L_i, R_j] = 0. \tag{6.42}$$

6.2.2 Maurer-Cartan Forms 

If $g \in G$, then $dg\,g^{-1} \in \mathrm{Lie}\,G$. For example, starting from

$$\begin{aligned}
g &= x^0 + ix^1\hat\sigma_1 + ix^2\hat\sigma_2 + ix^3\hat\sigma_3, \\
g^{-1} &= x^0 - ix^1\hat\sigma_1 - ix^2\hat\sigma_2 - ix^3\hat\sigma_3,
\end{aligned} \tag{6.43}$$

we have

$$\begin{aligned}
dg &= dx^0 + i\,dx^1\hat\sigma_1 + i\,dx^2\hat\sigma_2 + i\,dx^3\hat\sigma_3 \\
&= (x^0)^{-1}(-x^1dx^1 - x^2dx^2 - x^3dx^3) + i\,dx^1\hat\sigma_1 + i\,dx^2\hat\sigma_2 + i\,dx^3\hat\sigma_3.
\end{aligned} \tag{6.44}$$

From this we find

$$\begin{aligned}
dg\,g^{-1} ={}& i\hat\sigma_1\left((x^0 + (x^1)^2/x^0)\,dx^1 + (x^3 + x^1x^2/x^0)\,dx^2 + (-x^2 + x^1x^3/x^0)\,dx^3\right) \\
&+ i\hat\sigma_2\left((-x^3 + x^2x^1/x^0)\,dx^1 + (x^0 + (x^2)^2/x^0)\,dx^2 + (x^1 + x^2x^3/x^0)\,dx^3\right) \\
&+ i\hat\sigma_3\left((x^2 + x^3x^1/x^0)\,dx^1 + (-x^1 + x^3x^2/x^0)\,dx^2 + (x^0 + (x^3)^2/x^0)\,dx^3\right).
\end{aligned} \tag{6.45}$$

The part proportional to the identity matrix has cancelled. The result is
therefore a Lie algebra-valued 1-form. We define the (right invariant)
Maurer-Cartan forms $\omega^i_R$ by

$$dg\,g^{-1} = \omega_R = (i\hat\sigma_i)\,\omega^i_R. \tag{6.46}$$

If we evaluate the one-form $\omega^1_R$ on the right invariant vector field $R_1$, we find

$$\begin{aligned}
\omega^1_R(R_1) &= (x^0 + (x^1)^2/x^0)\,x^0 + (x^3 + x^1x^2/x^0)\,x^3 + (-x^2 + x^1x^3/x^0)(-x^2) \\
&= (x^0)^2 + (x^1)^2 + (x^2)^2 + (x^3)^2 \\
&= 1.
\end{aligned} \tag{6.47}$$

Working similarly, we find

$$\begin{aligned}
\omega^1_R(R_2) &= (x^0 + (x^1)^2/x^0)(-x^3) + (x^3 + x^1x^2/x^0)\,x^0 + (-x^2 + x^1x^3/x^0)\,x^1 \\
&= 0.
\end{aligned} \tag{6.48}$$

In general we discover that $\omega^i_R(R_j) = \delta^i_j$. These Maurer-Cartan forms
therefore constitute the dual basis to the right-invariant vector fields.
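
The duality statement can be checked at a sample point of the three-sphere. In the sketch below the coefficient arrays are read off from (6.45) and from the right-invariant fields $R_1$, $R_2$, $R_3$. The random test point and the function names are assumptions of this illustration.

```python
import numpy as np

def point_on_s3(rng):
    """Random point (x0, x1, x2, x3) on the unit three-sphere with x0 > 0."""
    x = rng.normal(size=4)
    x /= np.linalg.norm(x)
    return x if x[0] > 0 else -x

def right_fields(x):
    """Rows are the components along (d_1, d_2, d_3) of R_1, R_2, R_3."""
    x0, x1, x2, x3 = x
    return np.array([
        [x0, x3, -x2],
        [-x3, x0, x1],
        [x2, -x1, x0],
    ])

def right_forms(x):
    """Rows are the coefficients of (dx^1, dx^2, dx^3) in omega_R^1, omega_R^2, omega_R^3."""
    x0, x1, x2, x3 = x
    return np.array([
        [x0 + x1 * x1 / x0, x3 + x1 * x2 / x0, -x2 + x1 * x3 / x0],
        [-x3 + x2 * x1 / x0, x0 + x2 * x2 / x0, x1 + x2 * x3 / x0],
        [x2 + x3 * x1 / x0, -x1 + x3 * x2 / x0, x0 + x3 * x3 / x0],
    ])

def pairing(x):
    """Matrix with entries omega_R^i(R_j)."""
    return right_forms(x) @ right_fields(x).T

x = point_on_s3(np.random.default_rng(0))
print(np.round(pairing(x), 12))
```

The printed matrix should be the 3-by-3 identity, whichever point on the sphere is chosen.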

We may also define the left invariant Maurer-Cartan forms

$$g^{-1}dg = \omega_L = (i\hat\sigma_i)\,\omega^i_L. \tag{6.49}$$

These obey $\omega^i_L(L_j) = \delta^i_j$, showing that the $\omega^i_L$ are the dual basis to the
left-invariant vector fields.

Acting with the exterior derivative $d$ on $gg^{-1} = I$ tells us that
$d(g^{-1}) = -g^{-1}dg\,g^{-1}$. By exploiting this fact, together with the
anti-derivation property

$$d(a\wedge b) = da\wedge b + (-1)^p\, a\wedge db,$$

we may compute the exterior derivative of $\omega_R$. We find that

$$d\omega_R = d(dg\,g^{-1}) = (dg\,g^{-1})\wedge(dg\,g^{-1}) = \omega_R\wedge\omega_R. \tag{6.50}$$

A matrix product is implicit here. If it were not, the product of the two
identical 1-forms on the right would automatically be zero. If we make this
matrix structure explicit we find that

$$d\omega^k_R\,(i\hat\lambda_k) = -\frac{1}{2}f_{ij}{}^{k}\,(i\hat\lambda_k)\,\omega^i_R\wedge\omega^j_R, \tag{6.51}$$

so

$$d\omega^k_R = -\frac{1}{2}f_{ij}{}^{k}\,\omega^i_R\wedge\omega^j_R. \tag{6.52}$$

These equations are known as the Maurer-Cartan relations for the
right-invariant forms.

For the left-invariant forms we have

$$d\omega_L = d(g^{-1}dg) = -(g^{-1}dg)\wedge(g^{-1}dg) = -\omega_L\wedge\omega_L, \tag{6.53}$$

or

$$d\omega^k_L = +\frac{1}{2}f_{ij}{}^{k}\,\omega^i_L\wedge\omega^j_L. \tag{6.54}$$

The Maurer-Cartan relations appear when we quantize gauge theories.
They are one part of the BRST transformations of the Faddeev-Popov ghost
fields.



6.2.3 Euler Angles 

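In physics it is common to use Euler angles to parameterize SU(2). We can
write an arbitrary SU(2) matrix $U$ as a product

$$\begin{aligned}
U &= \exp\{-i\phi\hat\sigma_3/2\}\exp\{-i\theta\hat\sigma_2/2\}\exp\{-i\psi\hat\sigma_3/2\} \\
&= \begin{pmatrix} e^{-i\phi/2} & 0 \\ 0 & e^{i\phi/2} \end{pmatrix}
\begin{pmatrix} \cos\theta/2 & -\sin\theta/2 \\ \sin\theta/2 & \cos\theta/2 \end{pmatrix}
\begin{pmatrix} e^{-i\psi/2} & 0 \\ 0 & e^{i\psi/2} \end{pmatrix} \\
&= \begin{pmatrix} e^{-i(\phi+\psi)/2}\cos\theta/2 & -e^{i(\psi-\phi)/2}\sin\theta/2 \\ e^{i(\phi-\psi)/2}\sin\theta/2 & e^{+i(\psi+\phi)/2}\cos\theta/2 \end{pmatrix}.
\end{aligned} \tag{6.55}$$

Comparing with the earlier expression for $U$ in terms of the $x^\mu$, we obtain
the Euler-angle parameterization of the three-sphere

$$\begin{aligned}
x^0 &= \cos\theta/2\,\cos(\psi+\phi)/2, \\
x^1 &= \sin\theta/2\,\sin(\phi-\psi)/2, \\
x^2 &= -\sin\theta/2\,\cos(\phi-\psi)/2, \\
x^3 &= -\cos\theta/2\,\sin(\psi+\phi)/2.
\end{aligned} \tag{6.56}$$

If the angles are taken in the range $0 \le \phi < 2\pi$, $0 \le \theta < \pi$, $0 \le \psi < 4\pi$ we
cover the entire three-sphere once.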
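
The agreement between the matrix product (6.55) and the coordinates (6.56) is easy to confirm numerically. The following sketch builds $U$ both ways for one arbitrary choice of angles. The function names and the sample angles are illustrative assumptions.

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

def half_angle_exp(sigma, angle):
    """exp(-i angle sigma / 2), using the fact that sigma @ sigma = I."""
    return np.cos(angle / 2) * np.eye(2) - 1j * np.sin(angle / 2) * sigma

def u_from_euler(theta, phi, psi):
    """The product of three exponentials in (6.55)."""
    return half_angle_exp(s3, phi) @ half_angle_exp(s2, theta) @ half_angle_exp(s3, psi)

def x_from_euler(theta, phi, psi):
    """The coordinates (x0, x1, x2, x3) of (6.56)."""
    return np.array([
        np.cos(theta / 2) * np.cos((psi + phi) / 2),
        np.sin(theta / 2) * np.sin((phi - psi) / 2),
        -np.sin(theta / 2) * np.cos((phi - psi) / 2),
        -np.cos(theta / 2) * np.sin((psi + phi) / 2),
    ])

theta, phi, psi = 1.1, 2.3, 5.9
x = x_from_euler(theta, phi, psi)
u = x[0] * np.eye(2) + 1j * (x[1] * s1 + x[2] * s2 + x[3] * s3)
print(np.allclose(u, u_from_euler(theta, phi, psi)))  # both descriptions give the same matrix
print(np.isclose(x @ x, 1.0))                          # the point lies on the unit three-sphere
```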

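Exercise 6.5: Show that the Hopf map, defined in chapter 3, $\mathrm{Hopf}: S^3 \to S^2$,
is the "forgetful" map $(\theta, \phi, \psi) \to (\theta, \phi)$, where $\theta$ and $\phi$ are spherical polar
co-ordinates on the two-sphere.

Exercise 6.6: Show that

$$U^{-1}dU = -\frac{i}{2}\hat\sigma_i\,\Omega^i_L,$$

where

$$\begin{aligned}
\Omega^1_L &= \sin\psi\,d\theta - \sin\theta\cos\psi\,d\phi, \\
\Omega^2_L &= \cos\psi\,d\theta + \sin\theta\sin\psi\,d\phi, \\
\Omega^3_L &= d\psi + \cos\theta\,d\phi.
\end{aligned}$$

Compare these 1-forms with the components

$$\begin{aligned}
\omega_X &= \sin\psi\,\dot\theta - \sin\theta\cos\psi\,\dot\phi, \\
\omega_Y &= \cos\psi\,\dot\theta + \sin\theta\sin\psi\,\dot\phi, \\
\omega_Z &= \dot\psi + \cos\theta\,\dot\phi
\end{aligned}$$

of the angular velocity $\boldsymbol\omega$ of a body with respect to the body-fixed XYZ axes
in the Euler-angle conventions of exercise 2.17.

Similarly show that

$$dU\,U^{-1} = -\frac{i}{2}\hat\sigma_i\,\Omega^i_R,$$

where

$$\begin{aligned}
\Omega^1_R &= -\sin\phi\,d\theta + \sin\theta\cos\phi\,d\psi, \\
\Omega^2_R &= \cos\phi\,d\theta + \sin\theta\sin\phi\,d\psi, \\
\Omega^3_R &= d\phi + \cos\theta\,d\psi.
\end{aligned}$$

Compare these 1-forms with components $\omega_x$, $\omega_y$, $\omega_z$ of the same angular
velocity vector $\boldsymbol\omega$, but now with respect to the space-fixed xyz frame.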

6.2.4 Volume and Metric 

The manifold of any Lie group has a natural metric which is obtained by 
transporting the Killing form (see section 6.3.2) from the tangent space at 
the identity to any other point g by either left or right multiplication by 
g. In the case of a compact group, the resultant left and right invariant 
metrics coincide. In the case of SU(2) this metric is the usual metric on the 
three-sphere. 

Using the Euler angle expression for the $x^\mu$ to compute the $dx^\mu$, we can
express the metric on the sphere as

$$\begin{aligned}
\text{``}ds^2\text{''} &= (dx^0)^2 + (dx^1)^2 + (dx^2)^2 + (dx^3)^2 \\
&= \frac{1}{4}\left(d\theta^2 + \cos^2\theta/2\,(d\psi + d\phi)^2 + \sin^2\theta/2\,(d\psi - d\phi)^2\right) \\
&= \frac{1}{4}\left(d\theta^2 + d\psi^2 + d\phi^2 + 2\cos\theta\,d\phi\,d\psi\right).
\end{aligned} \tag{6.57}$$

Here, to save space, we have used the traditional physics way of writing a
metric. In the more formal notation, where we think of the metric as being
a bilinear function, we would write the last line as

$$g(\ ,\ ) = \frac{1}{4}\left(d\theta\otimes d\theta + d\psi\otimes d\psi + d\phi\otimes d\phi + \cos\theta\,(d\phi\otimes d\psi + d\psi\otimes d\phi)\right). \tag{6.58}$$

From (6.58) we find

$$g = \det(g_{\mu\nu}) = \frac{1}{4^3}\begin{vmatrix} 1 & 0 & 0 \\ 0 & 1 & \cos\theta \\ 0 & \cos\theta & 1 \end{vmatrix} = \frac{1}{64}(1 - \cos^2\theta) = \frac{1}{64}\sin^2\theta. \tag{6.59}$$

The volume element, $\sqrt{g}\,d\theta\,d\phi\,d\psi$, is therefore

$$d(\mathrm{Volume}) = \frac{1}{8}\sin\theta\,d\theta\,d\phi\,d\psi, \tag{6.60}$$

and the total volume of the sphere is

$$\mathrm{Vol}(S^3) = \frac{1}{8}\int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}d\phi\int_0^{4\pi}d\psi = 2\pi^2. \tag{6.61}$$

This coincides with the standard expression for the volume of $S^{d-1}$, the
surface of the $d$-dimensional unit ball,

$$\mathrm{Vol}(S^{d-1}) = \frac{2\pi^{d/2}}{\Gamma(d/2)}, \tag{6.62}$$

when $d = 4$.
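
The chain (6.58) to (6.61) can be reproduced numerically. The sketch below takes the metric components from (6.58), computes $\sqrt{\det g}$ with NumPy, and integrates over the Euler-angle ranges with a simple midpoint rule. The coordinate ordering $(\theta, \phi, \psi)$ and the number of grid points are arbitrary choices made for this illustration.

```python
import math
import numpy as np

def metric(theta):
    """Components of (6.58) in the coordinate order (theta, phi, psi)."""
    c = math.cos(theta)
    return 0.25 * np.array([
        [1.0, 0.0, 0.0],
        [0.0, 1.0, c],
        [0.0, c, 1.0],
    ])

def sphere_volume(n=2000):
    """Midpoint-rule integral of sqrt(det g) over 0<theta<pi, 0<phi<2pi, 0<psi<4pi."""
    h = math.pi / n
    theta_integral = sum(
        math.sqrt(np.linalg.det(metric((k + 0.5) * h))) * h for k in range(n)
    )
    # sqrt(det g) does not depend on phi or psi, so those two integrals are trivial.
    return theta_integral * (2 * math.pi) * (4 * math.pi)

print(sphere_volume(), 2 * math.pi ** 2)
```

The two printed numbers should agree to within the discretization error of the midpoint rule.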



Exercise 6.7: Evaluate the Maurer-Cartan form $\omega^3_L$ in terms of the Euler angle
parameterization and show that

$$i\omega^3_L = \frac{1}{2}\mathrm{tr}\left(\hat\sigma_3 U^{-1}dU\right) = -\frac{i}{2}(d\psi + \cos\theta\,d\phi).$$

Now recall that the Hopf map takes the point on the three-sphere with Euler
angle co-ordinates $(\theta, \phi, \psi)$ to the point on the two-sphere with spherical polar
co-ordinates $(\theta, \phi)$. Thus, if we set $A = -d\psi - \cos\theta\,d\phi$, then we find

$$F = dA = \sin\theta\,d\theta\,d\phi = \mathrm{Hopf}^*(d[\mathrm{Area}\,S^2]).$$

Also observe that

$$A\wedge F = -\sin\theta\,d\theta\,d\phi\,d\psi.$$

From this show that the Hopf index of the Hopf map itself is equal to

$$\frac{1}{16\pi^2}\int_{S^3} A\wedge F = -1.$$

Exercise 6.8: Show that for $U$ the defining two-by-two matrices of SU(2), we
have

$$\int_{\mathrm{SU}(2)} \mathrm{tr}\left[(U^{-1}dU)^3\right] = 24\pi^2.$$

Suppose we have a map $g : \mathbb{R}^3 \to \mathrm{SU}(2)$ such that $g(x)$ goes to the identity
element at infinity. Consider the integral

$$S[g] = \frac{1}{24\pi^2}\int_{\mathbb{R}^3} \mathrm{tr}\,(g^{-1}dg)^3,$$

where the 3-form $\mathrm{tr}\,(g^{-1}dg)^3$ is the pull-back to $\mathbb{R}^3$ of the form $\mathrm{tr}\,[(U^{-1}dU)^3]$
on SU(2). Show that if we vary $g \to g + \delta g$, then

$$\delta S[g] = \frac{1}{24\pi^2}\int_{\mathbb{R}^3} d\left\{3\,\mathrm{tr}\left((g^{-1}\delta g)(g^{-1}dg)^2\right)\right\} = 0,$$

and so $S[g]$ is a topological invariant of the map $g$. Conclude that the functional
$S[g]$ is an integer, that integer being the Brouwer degree, or winding number,
of the map $g : S^3 \to S^3$.

Exercise 6.9: Generalize the result of the previous problem to show, for any
mapping $x \mapsto g(x)$ into a Lie group $G$, and for $n$ an odd integer, that the
$n$-form $\mathrm{tr}\,(g^{-1}dg)^n$ constructed from the Maurer-Cartan form is closed, and
that

$$\delta\,\mathrm{tr}\,(g^{-1}dg)^n = d\left\{n\,\mathrm{tr}\left((g^{-1}\delta g)(g^{-1}dg)^{n-1}\right)\right\}.$$

(Note that for even $n$ the trace of $(g^{-1}dg)^n$ vanishes identically.)

6.2.5 SO(3) ≃ SU(2)/Z_2

The groups SU(2) and SO(3) are locally isomorphic. They have the same
Lie algebra, but differ in their global topology. Although rotations in space
are elements of SO(3), electrons respond to these rotations by transforming
under the two-dimensional defining representation of SU(2). As we shall see,
this means that after a rotation through $2\pi$ the electron wavefunction comes
back to minus itself. The resulting topological entanglement is characteristic
of the spinor representation of rotations and is intimately connected with
the Fermi statistics of the electron. The spin representations were discovered
by Élie Cartan in 1913, long before they were needed in physics.

The simplest way to motivate the spin/rotation connection is via the
Pauli sigma matrices. These matrices are hermitian, traceless, and obey

$$\hat\sigma_i\hat\sigma_j + \hat\sigma_j\hat\sigma_i = 2\delta_{ij}I. \tag{6.63}$$

If, for any $U \in \mathrm{SU}(2)$, we define

$$\hat\sigma'_i = U\hat\sigma_i U^{-1}, \tag{6.64}$$

then the $\hat\sigma'_i$ are also hermitian, traceless, and obey (6.63). Since the original
$\hat\sigma_i$ form a basis for the space of hermitian traceless matrices, we must have

$$\hat\sigma'_i = \hat\sigma_j R_{ji} \tag{6.65}$$

for some real 3-by-3 matrix $R$ having entries $R_{ij}$. From (6.63) we find that

$$\begin{aligned}
2\delta_{ij} &= \hat\sigma'_i\hat\sigma'_j + \hat\sigma'_j\hat\sigma'_i \\
&= (\hat\sigma_l R_{li})(\hat\sigma_m R_{mj}) + (\hat\sigma_m R_{mj})(\hat\sigma_l R_{li}) \\
&= (\hat\sigma_l\hat\sigma_m + \hat\sigma_m\hat\sigma_l)R_{li}R_{mj} \\
&= 2\delta_{lm}R_{li}R_{mj}.
\end{aligned}$$

Thus

$$R_{mi}R_{mk} = \delta_{ik}. \tag{6.66}$$

In other words, $R^T R = I$, and $R$ is an element of O(3). Now the determinant
of any orthogonal matrix is $\pm 1$, but the manifold of SU(2) is a connected set
and $R = I$ when $U = I$. Since a continuous map from a connected set to
the integers must be a constant, we conclude that $\det R = 1$ for all $U$. The
$R$ matrices are therefore in SO(3).
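
The map $U \mapsto R$ can be made concrete by tracing (6.65) against $\hat\sigma_j$, which gives $R_{ji} = \frac{1}{2}\mathrm{tr}(\hat\sigma_j U\hat\sigma_i U^{-1})$. The sketch below evaluates this for a randomly chosen $U$ and checks the properties derived above. The random sampling and the function names are assumptions of this illustration.

```python
import numpy as np

sigma = [
    np.array([[0, 1], [1, 0]], dtype=complex),
    np.array([[0, -1j], [1j, 0]], dtype=complex),
    np.array([[1, 0], [0, -1]], dtype=complex),
]

def random_su2(rng):
    """U = x0 + i x.sigma for a random point x on the unit three-sphere."""
    x = rng.normal(size=4)
    x /= np.linalg.norm(x)
    return x[0] * np.eye(2) + 1j * sum(x[k + 1] * sigma[k] for k in range(3))

def rotation_from_su2(u):
    """R_ji = tr(sigma_j U sigma_i U^{-1}) / 2, from tracing (6.65) against sigma_j."""
    u_inv = u.conj().T  # U is unitary, so its inverse is its hermitian conjugate
    r = np.empty((3, 3))
    for i in range(3):
        for j in range(3):
            r[j, i] = 0.5 * np.trace(sigma[j] @ u @ sigma[i] @ u_inv).real
    return r

u = random_su2(np.random.default_rng(1))
r = rotation_from_su2(u)
print(np.allclose(r.T @ r, np.eye(3)))        # R is orthogonal
print(np.isclose(np.linalg.det(r), 1.0))      # and has unit determinant
print(np.allclose(rotation_from_su2(-u), r))  # U and -U give the same R
```

The last line anticipates the two-to-one nature of the correspondence: the sign of $U$ drops out of the conjugation.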

We now exploit the principle of the sextant to show that the correspondence
goes both ways, i.e. we can find a $U(R)$ for any element $R \in \mathrm{SO}(3)$.

[Figure 6.1: The sextant. Labels in the figure: "To sun"; "To horizon"; "Left-hand half of fixed mirror is silvered. Right-hand half is transparent"; "View through telescope of sun brought down to touch horizon".]



This familiar instrument is used to measure the altitude of the sun above the 
horizon while standing on the pitching deck of a ship at sea. A theodolite or 
similar device would be rendered useless by the ship's motion. The sextant 
exploits the fact that successive reflection in two mirrors inclined at an angle
$\theta$ to one another serves to rotate the image through an angle $2\theta$ about the
line of intersection of the mirror planes. This rotation is used to superimpose
the image of the sun onto the image of the horizon, where it stays even if 
the instrument is rocked back and forth. Exactly the same trick is used in 
constructing the spinor representations of the rotation group. 

To do this, consider a vector $\mathbf{x}$ with components $x^i$ and form the matrix
$\hat{x} = x^i\hat\sigma_i$. Now, if $\mathbf{n}$ is a unit vector with components $n^i$, and $\hat{n} = n^i\hat\sigma_i$, then

$$(-\hat\sigma_i n^i)\,\hat{x}\,(\hat\sigma_k n^k) = \left(x^j - 2(\mathbf{n}\cdot\mathbf{x})\,n^j\right)\hat\sigma_j = \hat{x} - 2(\mathbf{n}\cdot\mathbf{x})\,\hat{n}. \tag{6.67}$$

The vector $\mathbf{x} - 2(\mathbf{n}\cdot\mathbf{x})\mathbf{n}$ is the result of reflecting $\mathbf{x}$ in the plane perpendicular
to $\mathbf{n}$. Consequently

$$-(\hat\sigma_1\cos\theta/2 + \hat\sigma_2\sin\theta/2)(-\hat\sigma_1)\,\hat{x}\,(\hat\sigma_1)(\hat\sigma_1\cos\theta/2 + \hat\sigma_2\sin\theta/2) \tag{6.68}$$

performs two successive reflections on $\mathbf{x}$, first in the "1" plane, and then in
a plane at an angle $\theta/2$ to it. Multiplying out the factors, and using the $\hat\sigma_i$
algebra, we find

$$\begin{aligned}
&(\cos\theta/2 - \hat\sigma_1\hat\sigma_2\sin\theta/2)\,\hat{x}\,(\cos\theta/2 + \hat\sigma_1\hat\sigma_2\sin\theta/2) \\
&\qquad = \hat\sigma_1(\cos\theta\,x^1 - \sin\theta\,x^2) + \hat\sigma_2(\sin\theta\,x^1 + \cos\theta\,x^2) + \hat\sigma_3 x^3.
\end{aligned} \tag{6.69}$$

The effect on $\mathbf{x}$ is a rotation through $\theta$, as claimed. We can drop the $x^i$ and
re-express (6.69) as

$$U\hat\sigma_i U^{-1} = \hat\sigma_j R_{ji}, \tag{6.70}$$

where $R_{ij}$ is the 3-by-3 rotation matrix for a rotation through angle $\theta$ in the
1-2 plane, and

$$U = \exp\left\{-\frac{i}{2}\hat\sigma_3\theta\right\} = \exp\left\{-\frac{1}{4}[\hat\sigma_1, \hat\sigma_2]\,\theta\right\} \tag{6.71}$$

is an element of SU(2). We have exhibited two ways of writing the exponents
in (6.71) because the subscript 3 on $\hat\sigma_3$ indicates the axis about which we are
rotating, while the 1,2 in $[\hat\sigma_1, \hat\sigma_2]$ indicates the plane in which the rotation
occurs. It is the second language that generalizes to higher dimensions. More
on the use of mirrors for creating and combining rotations can be found in
the appendix to Misner, Thorne, and Wheeler's Gravitation.

The mirror construction shows that for any $R \in \mathrm{SO}(3)$ there is a
two-dimensional unitary matrix $U(R)$ such that

$$U(R)\,\hat\sigma_i\,U^{-1}(R) = \hat\sigma_j R_{ji}. \tag{6.72}$$

This $U(R)$ is not unique, however. If $U \in \mathrm{SU}(2)$ then so is $-U$. Furthermore

$$U(R)\,\hat\sigma_i\,U^{-1}(R) = (-U(R))\,\hat\sigma_i\,(-U(R))^{-1}, \tag{6.73}$$

and so $U(R)$ and $-U(R)$ implement exactly the same rotation $R$. Conversely,
if two SU(2) matrices $U$, $V$ obey

$$U\hat\sigma_i U^{-1} = V\hat\sigma_i V^{-1} \tag{6.74}$$

then $V^{-1}U$ commutes with all 2-by-2 matrices and, by Schur's lemma, must
be a multiple of the identity. But if $\lambda I \in \mathrm{SU}(2)$ then $\lambda = \pm 1$. Thus $U = \pm V$.
The mapping between SU(2) and SO(3) is therefore two-to-one. Since $U$ and
$-U$ correspond to the same $R$, the group manifold of SO(3) is the
three-sphere with antipodal points identified. Unlike the two-sphere, where the
identification of antipodal points gives the non-orientable projective plane,
this three-manifold is orientable. It is not, however, simply connected: a
path on the three-sphere from a point to its antipode forms a closed loop
in SO(3), but one not contractible to a point. If we continue on from the
antipode back to the original point, the combined path is contractible. This
means that the first homotopy group, the group of homotopy classes of based
loops with composition given by concatenation, is $\pi_1(\mathrm{SO}(3)) = \mathbb{Z}_2$. This is the
topology behind the Philippine (or Balinese) Candle Dance, and is how the
electron knows whether a sequence of rotations that eventually bring it back
to its original orientation should be counted as a 360° rotation ($U = -I$) or
a 720° ∼ 0° rotation ($U = +I$).

Exercise 6.10: Verify that

$$U(R)\,\hat\sigma_i\,U^{-1}(R) = \hat\sigma_j R_{ji}$$

is consistent with $U(R_2)U(R_1) = \pm U(R_2R_1)$.

Spinor representations of SO(N)

The mirror trick can be extended to perform rotations in $N$ dimensions. We
replace the three $\hat\sigma_i$ matrices by a set of $N$ Dirac gamma matrices, which
obey the defining relations of a Clifford algebra

$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\delta_{\mu\nu}I. \tag{6.75}$$

These relations are a generalization of the key algebraic property of the Pauli
sigma matrices.

If $N$ $(= 2n)$ is even, then we can find $2^n$-by-$2^n$ hermitian matrices, $\gamma_\mu$,
satisfying this algebra. If $N$ $(= 2n + 1)$ is odd, we append to the matrices for
$N = 2n$ the hermitian matrix $\gamma_{2n+1} = -(i)^n\gamma_1\gamma_2\cdots\gamma_{2n}$ which obeys
$\gamma_{2n+1}^2 = 1$ and anti-commutes with all the other $\gamma_\mu$. The $\gamma$ matrices therefore
act on a $2^{[N/2]}$ dimensional space, where the square brackets denote the integer
part of $N/2$.

The $\gamma$'s do not form a Lie algebra as they stand, but a rotation through
$\theta$ in the $mn$-plane is obtained from

$$\exp\left\{-\frac{1}{4}[\gamma_m, \gamma_n]\,\theta\right\}\gamma_i\,\exp\left\{\frac{1}{4}[\gamma_m, \gamma_n]\,\theta\right\} = \gamma_j R_{ji}, \tag{6.76}$$

and we find that the hermitian matrices $\Gamma_{mn} = \frac{1}{4i}[\gamma_m, \gamma_n]$ form a basis for the
Lie algebra of SO(N). The $2^{[N/2]}$ dimensional space on which they act is the
Dirac spinor representation of SO(N). Although the matrices $\exp\{i\Gamma_{\mu\nu}\theta^{\mu\nu}\}$
are unitary, they are not, in general, the entirety of $\mathrm{U}(2^{[N/2]})$, but instead
constitute a subgroup called Spin(N).

If $N$ is even then we can still construct the matrix $\gamma_{2n+1}$ that
anti-commutes with all the other $\gamma_\mu$'s. It cannot be the identity matrix, therefore,
but it commutes with all the $\Gamma_{mn}$. By Schur's lemma, this means that the
SO(2n) Dirac spinor representation space $V$ is reducible. Now $\gamma_{2n+1}^2 = I$,
and so $\gamma_{2n+1}$ has eigenvalues $\pm 1$. The two eigenspaces are invariant under
the action of the group, and thus the Dirac spinor space decomposes into two
irreducible Weyl spinor representations

$$V = V_{\mathrm{odd}} \oplus V_{\mathrm{even}}. \tag{6.77}$$

Here $V_{\mathrm{even}}$ and $V_{\mathrm{odd}}$, the plus and minus eigenspaces of $\gamma_{2n+1}$, are called the
spaces of right and left chirality. When $N$ is odd the spinor representation
is irreducible.
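
For a concrete instance with $N = 4$, the sketch below builds four hermitian 4-by-4 gamma matrices from Kronecker products of Pauli matrices and checks the properties stated above. This particular construction is one convenient choice made for illustration, not necessarily the representation the text has in mind, and the variable names are likewise assumptions.

```python
import numpy as np
from itertools import combinations

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
one = np.eye(2, dtype=complex)

# One possible choice of four hermitian 4-by-4 gamma matrices (n = 2).
gammas = [np.kron(s1, one), np.kron(s2, one), np.kron(s3, s1), np.kron(s3, s2)]

def clifford_ok(gs):
    """Check gamma_mu gamma_nu + gamma_nu gamma_mu = 2 delta_mu_nu I."""
    dim = gs[0].shape[0]
    return all(
        np.allclose(a @ b + b @ a, 2 * (m == n) * np.eye(dim))
        for m, a in enumerate(gs)
        for n, b in enumerate(gs)
    )

n = len(gammas) // 2
# gamma_{2n+1} = -(i)^n gamma_1 gamma_2 ... gamma_{2n}
gamma_chiral = -(1j ** n) * np.linalg.multi_dot(gammas)

# Gamma_mn = [gamma_m, gamma_n] / (4i) for m < n
generators = [(a @ b - b @ a) / 4j for a, b in combinations(gammas, 2)]

print(clifford_ok(gammas))
print(all(np.allclose(gamma_chiral @ g + g @ gamma_chiral, 0) for g in gammas))
print(all(np.allclose(gamma_chiral @ G, G @ gamma_chiral) for G in generators))
print(np.round(np.linalg.eigvalsh(gamma_chiral), 12))
```

The last line lists the eigenvalues of $\gamma_{2n+1}$, whose $+1$ and $-1$ eigenspaces are the two chirality sectors.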



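Exercise 6.11: Starting from the defining relations of the Clifford algebra (6.75)
show that, for $N = 2n$,

$$\begin{aligned}
\mathrm{tr}(\gamma_\mu) &= 0, \\
\mathrm{tr}(\gamma_{2n+1}) &= 0, \\
\mathrm{tr}(\gamma_\mu\gamma_\nu) &= \mathrm{tr}(I)\,\delta_{\mu\nu}, \\
\mathrm{tr}(\gamma_\mu\gamma_\nu\gamma_\sigma) &= 0, \\
\mathrm{tr}(\gamma_\mu\gamma_\nu\gamma_\sigma\gamma_\tau) &= \mathrm{tr}(I)\left(\delta_{\mu\nu}\delta_{\sigma\tau} - \delta_{\mu\sigma}\delta_{\nu\tau} + \delta_{\mu\tau}\delta_{\nu\sigma}\right).
\end{aligned}$$

Exercise 6.12: Consider the space $\Omega(\mathbb{C}) = \bigoplus_p \Omega^p(\mathbb{C})$ of complex-valued skew
symmetric tensors $A_{\mu_1\ldots\mu_p}$ for $0 \le p \le N = 2n$. Let

$$\psi_{\alpha\beta} = \sum_{p=0}^{N}\frac{1}{p!}\left(\gamma_{\mu_1}\cdots\gamma_{\mu_p}\right)_{\alpha\beta}A_{\mu_1\ldots\mu_p}$$

define a mapping from $\Omega(\mathbb{C})$ into the space of complex matrices of the same
size as the $\gamma_\mu$. Show that this mapping is invertible, i.e. given $\psi_{\alpha\beta}$ you can
recover the $A_{\mu_1\ldots\mu_p}$. By showing that the dimension of $\Omega(\mathbb{C})$ is $2^N$, deduce
that the $\gamma_\mu$ must be at least $2^n$-by-$2^n$ matrices.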

Exercise 6.13: Show that the $\mathbb{R}^{2n}$ Dirac operator $D = \gamma_\mu\partial_\mu$ obeys $D^2 = \nabla^2$.
Recall that the Hodge operator $d - \delta$ from section 4.7.1 is also a "square root" of
the Laplacian:

$$(d - \delta)^2 = -(d\delta + \delta d) = \nabla^2.$$

Show that

$$\psi_{\alpha\beta} \to (D\psi)_{\alpha\beta} = (\gamma_\mu)_{\alpha\alpha'}\,\partial_\mu\psi_{\alpha'\beta}$$

corresponds to the action of $d - \delta$ on the space $\Omega(\mathbb{R}^{2n}, \mathbb{C})$ of differential forms

$$A = \frac{1}{p!}A_{\mu_1\ldots\mu_p}(x)\,dx^{\mu_1}\cdots dx^{\mu_p}.$$

The space of complex-valued differential forms has thus been made to look like
a collection of $2^n$ Dirac spinor fields, one for each value of the "flavour index"
$\beta$. These $\psi_{\alpha\beta}$ are called Kähler-Dirac fields. They are not really flavoured
spinors because a rotation transforms both the $\alpha$ and $\beta$ indices.

Exercise 6.14: That a set of $2n$ Dirac $\gamma$'s have a $2^n$-by-$2^n$ matrix representation
is most naturally established by using the tools of second quantization. To this
end, let $a_i$, $a_i^\dagger$, $i = 1, \ldots, n$ be a set of anti-commuting annihilation and creation
operators obeying

$$a_ia_j + a_ja_i = 0, \qquad a_ia_j^\dagger + a_j^\dagger a_i = \delta_{ij}I,$$

and let $|0\rangle$ be the "no particle" state such that $a_i|0\rangle = 0$, $i = 1, \ldots, n$. Then
the $2^n$ states

$$|m_1, \ldots, m_n\rangle = (a_1^\dagger)^{m_1}\cdots(a_n^\dagger)^{m_n}|0\rangle,$$

where the $m_i$ take the value 0 or 1, constitute a basis for a space on which the
$a_i$ and $a_i^\dagger$ act irreducibly. Show that the $2n$ operators

$$\begin{aligned}
\gamma_i &= a_i + a_i^\dagger, \\
\gamma_{i+n} &= i(a_i - a_i^\dagger)
\end{aligned}$$

obey

$$\gamma_\mu\gamma_\nu + \gamma_\nu\gamma_\mu = 2\delta_{\mu\nu}I,$$

and hence can be represented by $2^n$-by-$2^n$ matrices. Deduce further that
the spaces of left and right chirality are the spaces of odd or even "particle
number."

The Adjoint Representation

The spin/rotation correspondence involves conjugation: $\hat\sigma_i \to U\hat\sigma_i U^{-1}$. The
idea of obtaining a representation by conjugation works for an arbitrary Lie
group. It is easiest, however, to describe in the case of a matrix group
where we consider an infinitesimal element $I + i\epsilon\hat\lambda_i$. The conjugate element
$g(I + i\epsilon\hat\lambda_i)g^{-1}$ will also be an infinitesimal element. Since $gIg^{-1} = I$, this
means that $g(i\hat\lambda_i)g^{-1}$ must be expressible as a linear combination of the $i\hat\lambda_i$
matrices. Consequently we can define a linear map acting on the element
$X = \xi^i\hat\lambda_i$ of the Lie algebra by setting

$$\mathrm{Ad}(g)\hat\lambda_i \equiv g\hat\lambda_i g^{-1} = \hat\lambda_j\,[\mathrm{Ad}(g)]^j{}_i. \tag{6.78}$$

The matrices with entries $[\mathrm{Ad}(g)]^j{}_i$ form the adjoint representation of the
group. The dimension of the adjoint representation coincides with that of
the group manifold. The spinor construction shows that the defining
representation of SO(3) is the adjoint representation of SU(2).

For a general Lie group, we make $\mathrm{Ad}(g)$ act on a vector in the tangent
space at the identity by pushing the vector forward to $TG_g$ by left
multiplication by $g$, and then pushing it back from $TG_g$ to $TG_e$ by right
multiplication by $g^{-1}$.

Exercise 6.15: Show that

$$[\mathrm{Ad}(g_1g_2)]^j{}_i = [\mathrm{Ad}(g_1)]^j{}_k\,[\mathrm{Ad}(g_2)]^k{}_i,$$

thus confirming that $\mathrm{Ad}(g)$ is a representation.

6.2.6 Peter-Weyl Theorem

The volume element constructed in section 6.2.4 has the feature that it is
invariant. In other words, if we have a subset $\Omega$ of the group manifold with
volume $V$, then the image set $g\Omega$ under left multiplication has exactly the
same volume. We can also construct a volume element that is invariant under
right multiplication by $g$, and in general these will be different. For a group
whose manifold is a compact set, however, both left- and right-invariant
volume elements coincide. The resulting measure on the group manifold is
called the Haar measure.

For a compact group, therefore, we can replace the sums over the group
elements that occur in the representation theory of finite groups by
convergent integrals over the group elements using the invariant Haar measure,
which is usually denoted by $d[g]$. The invariance property is expressed by
$d[g_1g] = d[g]$ for any constant element $g_1$. This allows us to make a
change-of-variables transformation, $g \to g_1g$, identical to that which played such an
important role in deriving the finite group theorems. Consequently, all the
results from finite groups, such as the existence of an invariant inner product
and the orthogonality theorems, can be taken over by the simple replacement
of a sum by an integral. In particular, if we normalize the measure so that
the volume of the group manifold is unity, we have the orthogonality relation

$$\int_G d[g]\,\left(D^J_{ij}(g)\right)^* D^K_{lm}(g) = \frac{1}{\dim J}\,\delta^{JK}\delta_{il}\delta_{jm}. \tag{6.79}$$

The Peter-Weyl theorem asserts that the representation matrices, $D^J_{mn}(g)$,
form a complete set of orthogonal functions on the group manifold. In the
case of SU(2) this tells us that the spin $J$ representation matrices

$$\begin{aligned}
D^J_{mn}(\theta, \phi, \psi) &= \langle J, m|\,e^{-iJ_3\phi}e^{-iJ_2\theta}e^{-iJ_3\psi}\,|J, n\rangle \\
&= e^{-im\phi}\,d^J_{mn}(\theta)\,e^{-in\psi},
\end{aligned} \tag{6.80}$$

which you will know from quantum mechanics courses,¹ are a complete set
of functions on the three-sphere with

$$\frac{1}{16\pi^2}\int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}d\phi\int_0^{4\pi}d\psi\,\left(D^J_{mn}(\theta, \phi, \psi)\right)^* D^{J'}_{m'n'}(\theta, \phi, \psi) = \frac{1}{2J+1}\,\delta^{JJ'}\delta_{mm'}\delta_{nn'}. \tag{6.81}$$

Since the $D^L_{m0}$ (where $L$ has to be an integer for $n = 0$ to be possible) are
independent of the third Euler angle, $\psi$, we can do the trivial integral over
$\psi$ to get

$$\frac{1}{4\pi}\int_0^{\pi}\sin\theta\,d\theta\int_0^{2\pi}d\phi\,\left(D^L_{m0}(\theta, \phi)\right)^* D^{L'}_{m'0}(\theta, \phi) = \frac{1}{2L+1}\,\delta^{LL'}\delta_{mm'}. \tag{6.82}$$

Comparing with the definition of the spherical harmonics, we see that we can
identify

$$Y^L_m(\theta, \phi) = \sqrt{\frac{2L+1}{4\pi}}\,\left(D^L_{m0}(\theta, \phi, 0)\right)^*. \tag{6.83}$$

The complex conjugation is necessary here because $D^J_{mn}(\theta, \phi, \psi) \propto e^{-im\phi}$,
while $Y^L_m(\theta, \phi) \propto e^{im\phi}$.

¹ See, for example, G. Baym, Lectures on Quantum Mechanics, Ch. 17.

The character, $\chi^J(\theta) = D^J_{nn}(\theta)$, will be a function only of the angle $\theta$
we have rotated through, not the axis of rotation, because all rotations through
a common angle are conjugate to one another. Because of this, $\chi^J(\theta)$ can be
found most simply by looking at rotations about the $z$ axis, since these give
rise to easily computed diagonal matrices. We find

$$\begin{aligned}
\chi^J(\theta) &= e^{iJ\theta} + e^{i(J-1)\theta} + \cdots + e^{-i(J-1)\theta} + e^{-iJ\theta} \\
&= \frac{\sin\left((2J+1)\theta/2\right)}{\sin(\theta/2)}.
\end{aligned} \tag{6.84}$$

Warning: The angle $\theta$ in this formula and the next is not the Euler
angle.

For integer $J$, corresponding to non-spinor rotations, a rotation through
an angle $\theta$ about an axis $\mathbf{n}$ and a rotation through an angle $2\pi - \theta$ about $-\mathbf{n}$
are the same operation. The maximum rotation angle is therefore $\pi$. For
spinor rotations this equivalence does not hold, and the rotation angle $\theta$ runs
from 0 to $2\pi$. The character orthogonality must therefore be

$$\frac{1}{\pi}\int_0^{2\pi}\chi^J(\theta)\,\chi^{J'}(\theta)\,\sin^2(\theta/2)\,d\theta = \delta^{JJ'}, \tag{6.85}$$

implying that the volume fraction of the rotation group containing rotations
through angles between $\theta$ and $\theta + d\theta$ is $\sin^2(\theta/2)\,d\theta/\pi$.
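
The orthogonality relation (6.85) is straightforward to test by direct numerical integration. The sketch below uses only the Python standard library. Representations are labelled by the integer $2J$ so that half-integer spins are covered, and the midpoint rule with 4000 points is an arbitrary choice for this illustration.

```python
import cmath
import math

def character(two_j, theta):
    """chi^J(theta) as the sum of exp(i m theta) for m = -J, ..., J, with two_j = 2J."""
    total = sum(cmath.exp(1j * (m2 / 2) * theta) for m2 in range(-two_j, two_j + 1, 2))
    return total.real  # the terms come in complex-conjugate pairs

def overlap(two_j, two_k, n=4000):
    """Midpoint-rule estimate of the left-hand side of (6.85)."""
    h = 2 * math.pi / n
    total = 0.0
    for step in range(n):
        theta = (step + 0.5) * h
        total += character(two_j, theta) * character(two_k, theta) * math.sin(theta / 2) ** 2
    return total * h / math.pi

for two_j in range(4):
    print([round(overlap(two_j, two_k), 6) for two_k in range(4)])
```

The printed table of overlaps for $J, J' \in \{0, \tfrac12, 1, \tfrac32\}$ should reproduce the identity matrix up to rounding.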

Exercise 6.16: Prove this last statement about the volume of the equivalence
classes by showing that the volume of the unit three-sphere that lies between
a rotation angle of $\theta$ and $\theta + d\theta$ is $2\pi\sin^2(\theta/2)\,d\theta$.



6.2.7 Lie Brackets vs. Commutators 

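There is an irritating minus sign problem that needs to be acknowledged.
The Lie bracket $[X, Y]$ of two vector fields is defined by first running along
$X$, then $Y$ and then back in the reverse order. If we do this for the action of
matrices, $X$ and $Y$, on a vector space, however, then, reading from right to
left as we always do for matrix operations, we have

$$e^{-t_2Y}e^{-t_1X}e^{t_2Y}e^{t_1X} = I - t_1t_2[X, Y] + \cdots, \tag{6.86}$$

which has the other sign. Consider for example rotations about the $x$, $y$, $z$
axes, and look at the effect these have on the co-ordinates of a point:

$$\begin{aligned}
L_x &= y\partial_z - z\partial_y, & \delta y &= -z\,\delta\theta_x, & \delta z &= +y\,\delta\theta_x, \\
L_y &= z\partial_x - x\partial_z, & \delta z &= -x\,\delta\theta_y, & \delta x &= +z\,\delta\theta_y, \\
L_z &= x\partial_y - y\partial_x, & \delta x &= -y\,\delta\theta_z, & \delta y &= +x\,\delta\theta_z.
\end{aligned}$$

The same infinitesimal rotations, written as matrices acting on the column
vector $(x, y, z)^T$, are

$$\hat{L}_x = \begin{pmatrix} 0 & 0 & 0 \\ 0 & 0 & -1 \\ 0 & 1 & 0 \end{pmatrix}, \qquad
\hat{L}_y = \begin{pmatrix} 0 & 0 & 1 \\ 0 & 0 & 0 \\ -1 & 0 & 0 \end{pmatrix}, \qquad
\hat{L}_z = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.$$

From this we find

$$[L_x, L_y] = -L_z, \tag{6.87}$$

as a Lie bracket of vector fields, but

$$[\hat{L}_x, \hat{L}_y] = +\hat{L}_z, \tag{6.88}$$

as a commutator of matrices. This is the reason why it is the left invariant
vector fields whose Lie bracket coincides with the commutator of the $i\hat\lambda_i$
matrices.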
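
The two signs in (6.87) and (6.88) can be seen side by side in a short computation. The sketch below treats each rotation generator both as a matrix and as the linear vector field it induces on $\mathbb{R}^3$, and evaluates the vector-field bracket $[V, W]^i = V^j\partial_jW^i - W^j\partial_jV^i$ by central differences. The sample point, step size and function names are assumptions of this illustration.

```python
import numpy as np

lx = np.array([[0, 0, 0], [0, 0, -1], [0, 1, 0]], dtype=float)
ly = np.array([[0, 0, 1], [0, 0, 0], [-1, 0, 0]], dtype=float)
lz = np.array([[0, -1, 0], [1, 0, 0], [0, 0, 0]], dtype=float)

def field(a):
    """The linear vector field whose components at the point r are (a r)^i."""
    return lambda r: a @ r

def lie_bracket(v, w, r, h=1e-5):
    """Components of [V, W]^i = V^j d_j W^i - W^j d_j V^i at r, by central differences."""
    def directional(f, direction):
        return (f(r + h * direction) - f(r - h * direction)) / (2 * h)
    return directional(w, v(r)) - directional(v, w(r))

r = np.array([0.3, -1.2, 0.8])
print(np.allclose(lx @ ly - ly @ lx, lz))                                # matrix commutator: +L_z
print(np.allclose(lie_bracket(field(lx), field(ly), r), -field(lz)(r)))  # vector fields: -L_z
```

For linear fields the bracket is minus the field of the matrix commutator, which is exactly the sign discrepancy described above.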

Some insight into all this can be had by considering the action of the left
invariant fields on the representation matrices, $D^J_{mn}(g)$. For example

$$\begin{aligned}
L_i D^J_{mn}(g) &= \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left(D^J_{mn}(g(1 + i\epsilon\hat\lambda_i)) - D^J_{mn}(g)\right) \\
&= \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left(D^J_{mn'}(g)\,D^J_{n'n}(1 + i\epsilon\hat\lambda_i) - D^J_{mn}(g)\right) \\
&= \lim_{\epsilon\to 0}\frac{1}{\epsilon}\left(D^J_{mn'}(g)\left(\delta_{n'n} + i\epsilon(\hat\Lambda^J_i)_{n'n}\right) - D^J_{mn}(g)\right) \\
&= D^J_{mn'}(g)\,(i\hat\Lambda^J_i)_{n'n},
\end{aligned} \tag{6.89}$$

where $\hat\Lambda^J_i$ is the matrix representing $\hat\lambda_i$ in the representation $J$. Repeating
this exercise we find that

$$L_i\left(L_j D^J_{mn}(g)\right) = D^J_{mn''}(g)\,(i\hat\Lambda^J_i)_{n''n'}\,(i\hat\Lambda^J_j)_{n'n}. \tag{6.90}$$

Thus

$$[L_i, L_j]\,D^J_{mn}(g) = D^J_{mn'}(g)\,[i\hat\Lambda^J_i, i\hat\Lambda^J_j]_{n'n}, \tag{6.91}$$

and we get the commutator of the representation matrices in the "correct"
order only if we multiply the infinitesimal elements in successively from the
right.

There appears to be no escape from this sign problem. Many texts simply
ignore it, a few define the Lie bracket of vector fields with the opposite sign,
and a few simply point out the inconvenience and get on with the job.
We will follow the last route.

6.3 Lie Algebras 

A Lie algebra $\mathfrak{g}$ is a (real or complex) finite-dimensional vector space with a
non-associative binary operation $\mathfrak{g}\times\mathfrak{g}\to\mathfrak{g}$ that assigns to each ordered pair
of elements, $X_1, X_2$, a third element called the Lie bracket, $[X_1, X_2]$. The
bracket is:

a) Skew symmetric: $[X, Y] = -[Y, X]$,

b) Linear: $[\lambda X + \mu Y, Z] = \lambda[X, Z] + \mu[Y, Z]$,

and in place of associativity, obeys

c) The Jacobi identity: $[[X, Y], Z] + [[Y, Z], X] + [[Z, X], Y] = 0$.

Example: Let $M(n)$ denote the algebra of real $n$-by-$n$ matrices. As a vector
space over $\mathbb{R}$, this algebra is $n^2$ dimensional. Setting $[A, B] = AB - BA$
makes $M(n)$ into a Lie algebra.

Example: Let $\mathfrak{b}_+$ denote the subset of $M(n)$ consisting of upper triangular
matrices with any number (including zero) allowed on the diagonal. Then
$\mathfrak{b}_+$ with the above bracket is a Lie algebra. (The "b" stands for the Swiss
mathematician Armand Borel.)

Example: Let $\mathfrak{n}_+$ denote the subset of $\mathfrak{b}_+$ consisting of strictly upper
triangular matrices (those with zero on the diagonal). Then $\mathfrak{n}_+$ with the above
bracket is a Lie algebra. (The "n" stands for nilpotent.)

Example: Let $G$ be a Lie group, and $L_i$ the left invariant vector fields. We
know that

$$[L_i, L_j] = f_{ij}{}^{k}L_k, \tag{6.92}$$

where $[\ ,\ ]$ is the Lie bracket of vector fields. The resulting Lie algebra,
$\mathfrak{g} = \mathrm{Lie}\,G$, is the Lie algebra of the group.

Example: The set $N_+$ of upper triangular matrices with 1's on the diagonal
forms a Lie group and has $\mathfrak{n}_+$ as its Lie algebra. Similarly, the set $B_+$
consisting of upper triangular matrices, with any non-zero number allowed
on the diagonal, is also a Lie group, and has $\mathfrak{b}_+$ as its Lie algebra.

Ideals and Quotient algebras 

As we saw in the examples, we can define subalgebras of a Lie algebra. If
we want to define quotient algebras by analogy to quotient groups, we need
a concept analogous to that of invariant subgroups. This is provided by the
notion of an ideal. An ideal is a subalgebra $\mathfrak{i} \subseteq \mathfrak{g}$ with the property that

$$[\mathfrak{i}, \mathfrak{g}] \subseteq \mathfrak{i}. \tag{6.93}$$

In other words, taking the bracket of any element of $\mathfrak{g}$ with any element
of $\mathfrak{i}$ gives an element in $\mathfrak{i}$. With this definition we can form $\mathfrak{g} - \mathfrak{i}$ by identifying
$X \sim X + I$ for any $I \in \mathfrak{i}$. Then

$$[X + \mathfrak{i}, Y + \mathfrak{i}] = [X, Y] + \mathfrak{i}, \tag{6.94}$$

and the bracket of two equivalence classes is insensitive to the choice of
representatives.

If a Lie group $G$ has an invariant subgroup $H$ which is also a Lie group,
then the Lie algebra $\mathfrak{h}$ of the subgroup is an ideal in $\mathfrak{g} = \mathrm{Lie}\,G$ and the Lie
algebra of the quotient group $G/H$ is the quotient algebra $\mathfrak{g} - \mathfrak{h}$.

If the Lie algebra has no non-trivial ideals, then it is said to be simple.
The Lie algebra of a simple Lie group will be simple.

Exercise 6.17: Let $\mathfrak{i}_1$ and $\mathfrak{i}_2$ be ideals in $\mathfrak{g}$. Show that $\mathfrak{i}_1 \cap \mathfrak{i}_2$ is also an ideal in
$\mathfrak{g}$.

6.3.1 Adjoint Representation 

Given an element $X \in \mathfrak{g}$ let it act on the Lie algebra considered as a vector
space by a linear map $\mathrm{ad}(X)$ defined by

$$\mathrm{ad}(X)Y = [X, Y]. \tag{6.95}$$

The Jacobi identity is then equivalent to the statement

$$\left(\mathrm{ad}(X)\,\mathrm{ad}(Y) - \mathrm{ad}(Y)\,\mathrm{ad}(X)\right)Z = \mathrm{ad}([X, Y])\,Z. \tag{6.96}$$

Thus

$$\left(\mathrm{ad}(X)\,\mathrm{ad}(Y) - \mathrm{ad}(Y)\,\mathrm{ad}(X)\right) = \mathrm{ad}([X, Y]), \tag{6.97}$$

or

$$[\mathrm{ad}(X), \mathrm{ad}(Y)] = \mathrm{ad}([X, Y]), \tag{6.98}$$

and the map $X \to \mathrm{ad}(X)$ is a representation of the algebra called the adjoint
representation.

The linear map "$\mathrm{ad}(X)$" exponentiates to give a map $\exp[\mathrm{ad}(tX)]$ defined
by

$$\exp[\mathrm{ad}(tX)]\,Y = Y + t[X, Y] + \frac{1}{2}t^2[X, [X, Y]] + \cdots. \tag{6.99}$$

You probably know the matrix identity²

$$e^{tA}Be^{-tA} = B + t[A, B] + \frac{1}{2}t^2[A, [A, B]] + \cdots. \tag{6.100}$$

Now, earlier in the chapter, we defined the adjoint representation "Ad" of
the group on the vector space of the Lie algebra. We did this by setting
$gXg^{-1} = \mathrm{Ad}(g)X$. Comparing the two previous equations we see that

$$\mathrm{Ad}(\mathrm{Exp}\,Y) = \exp(\mathrm{ad}(Y)). \tag{6.101}$$
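
The matrix identity (6.100), and hence (6.101) for matrix groups, can be checked term by term. The sketch below sums both sides as truncated series for randomly chosen 3-by-3 real matrices. The truncation order, the random matrices and the function names are assumptions of this illustration.

```python
import numpy as np

def exp_series(a, terms=60):
    """Truncated power series for the matrix exponential."""
    result = np.eye(a.shape[0])
    term = np.eye(a.shape[0])
    for n in range(1, terms):
        term = term @ a / n          # term now holds a^n / n!
        result = result + term
    return result

def exp_ad(a, b, terms=60):
    """B + [A, B] + [A, [A, B]]/2! + ..., i.e. exp(ad(A)) applied to B."""
    result = b.copy()
    nested = b.copy()
    for n in range(1, terms):
        nested = (a @ nested - nested @ a) / n   # n-fold nested commutator / n!
        result = result + nested
    return result

rng = np.random.default_rng(2)
a = 0.5 * rng.normal(size=(3, 3))
b = rng.normal(size=(3, 3))
lhs = exp_series(a) @ b @ exp_series(-a)   # e^A B e^{-A}
print(np.allclose(lhs, exp_ad(a, b)))
```

Conjugation by the group element $e^{A}$ and exponentiation of the linear map $\mathrm{ad}(A)$ give the same matrix, as the identity asserts.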

6.3.2 The Killing form

Using "ad" we can define an inner product $\langle\ ,\ \rangle$ on a real Lie algebra by
setting

$$\langle X, Y\rangle = \mathrm{tr}\left(\mathrm{ad}(X)\,\mathrm{ad}(Y)\right). \tag{6.102}$$

This inner product is called the Killing form, after Wilhelm Killing. Using
the Jacobi identity, and the cyclic property of the trace, we find that

$$\langle\mathrm{ad}(X)Y, Z\rangle + \langle Y, \mathrm{ad}(X)Z\rangle = 0, \tag{6.103}$$

or, equivalently,

$$\langle[X, Y], Z\rangle + \langle Y, [X, Z]\rangle = 0. \tag{6.104}$$

From this we deduce (by differentiating with respect to $t$) that

$$\langle\exp(\mathrm{ad}(tX))Y, \exp(\mathrm{ad}(tX))Z\rangle = \langle Y, Z\rangle, \tag{6.105}$$

so the Killing form is invariant under the action of the adjoint representation
of the group on the algebra. When our group is simple, any other invariant
inner product will be proportional to this Killing-form product.

² In case you do not, it is easily proved by setting $F(t) = e^{tA}Be^{-tA}$, noting that
$\frac{d}{dt}F(t) = [A, F(t)]$, and observing that the RHS is the unique series solution to this
equation satisfying the boundary condition $F(0) = B$.

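Exercise 6.18: Let $\mathfrak{i}$ be an ideal in $\mathfrak{g}$. Show that for $I_1, I_2 \in \mathfrak{i}$

$$\langle I_1, I_2\rangle_{\mathfrak{g}} = \langle I_1, I_2\rangle_{\mathfrak{i}},$$

where $\langle\ ,\ \rangle_{\mathfrak{i}}$ is the Killing form on $\mathfrak{i}$ considered as a Lie algebra in its own
right. (This equality of inner products is not true for subalgebras that are not
ideals.)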

Semi-simplicity 

Recall that a Lie algebra containing no non-trivial ideals is said to be simple.
When the Killing form is non-degenerate, the Lie algebra is said to be
semi-simple. The reason for this name is that a semi-simple algebra is almost
simple, in that it can be decomposed into a direct sum of decoupled simple
algebras

$$\mathfrak{g} = \mathfrak{s}_1 \oplus \mathfrak{s}_2 \oplus \cdots \oplus \mathfrak{s}_n. \tag{6.106}$$

Here the direct sum symbol "$\oplus$" implies not only a direct sum of vector
spaces but also that $[\mathfrak{s}_i, \mathfrak{s}_j] = 0$ for $i \ne j$.

The Lie algebras of the matrix groups O(n), Sp(n), SU(n), etc. are
semi-simple (indeed they are usually simple) but this is not true of the
algebras $\mathfrak{n}_+$ and $\mathfrak{b}_+$.
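The non-degeneracy criterion can be tried out on small examples. The sketch below is an illustration under stated assumptions, not part of the original notes: it computes the Killing form directly from real structure constants $c_{ij}{}^{k}$ defined by $[e_i, e_j] = c_{ij}{}^{k} e_k$ (no factor of $i$), first for so(3) and then for a two-dimensional non-abelian algebra $[e_0, e_1] = e_1$, which is chosen here only as a small solvable example. All function names are hypothetical.

```python
# Killing form K_ij = tr(ad(e_i) ad(e_j)) from real structure constants
# c[i][j][k], where [e_i, e_j] = sum_k c[i][j][k] e_k.
# Both example algebras below are illustrative choices.

def killing_form(c):
    n = len(c)
    # The matrix of ad(e_i) has entries [ad(e_i)]^k_j = c[i][j][k].
    return [[sum(c[i][l][k] * c[j][k][l] for k in range(n) for l in range(n))
             for j in range(n)] for i in range(n)]

def det(M):
    """Determinant by cofactor expansion (fine for these tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def so3_constants():
    """[e_i, e_j] = epsilon_ijk e_k."""
    eps = [[[0] * 3 for _ in range(3)] for _ in range(3)]
    for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
        eps[i][j][k] = 1
        eps[j][i][k] = -1
    return eps

def two_dim_solvable_constants():
    """Two-dimensional non-abelian algebra with [e_0, e_1] = e_1."""
    c = [[[0] * 2 for _ in range(2)] for _ in range(2)]
    c[0][1][1] = 1
    c[1][0][1] = -1
    return c

print(killing_form(so3_constants()))               # -2 times the identity
print(killing_form(two_dim_solvable_constants()))  # has a zero eigenvalue
```

For so(3) the form is $-2\delta_{ij}$, which is non-degenerate. In the second algebra $e_1$ spans an abelian ideal and is Killing-orthogonal to everything, so the form is degenerate.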

Cartan showed that our Killing-form definition of semi-simplicity is
equivalent to his original definition of a Lie algebra being semi-simple if it contains
no abelian ideal, i.e. no ideal $\mathfrak{i}$ with $[I_i, I_j] = 0$ for all $I_i, I_j \in \mathfrak{i}$. The following
exercises establish the direct sum decomposition, and, en passant, the easy
half of Cartan's result.

Exercise 6.19: Use the identity (6.104) to show that if $\mathfrak{i} \subset \mathfrak{g}$ is an ideal, then
$\mathfrak{i}^\perp$, the set of elements orthogonal to $\mathfrak{i}$ with respect to the Killing form, is also
an ideal.

Exercise 6.20: Show that if $\mathfrak{a}$ is an abelian ideal, then every element of $\mathfrak{a}$ is
Killing perpendicular to the entire Lie algebra. (Thus non-degeneracy $\Rightarrow$ no
non-trivial abelian ideal. The null space of the Killing form is not necessarily
an abelian ideal, though, so establishing the converse is harder.)

Exercise 6.21: Let $\mathfrak{g}$ be semi-simple and $\mathfrak{i} \subset \mathfrak{g}$ an ideal. We know from exercise
6.17 that $\mathfrak{i} \cap \mathfrak{i}^\perp$ is an ideal. Use (6.104) coupled with the non-degeneracy of
the Killing form to show that it is an abelian ideal. Use the previous exercise
to conclude that $\mathfrak{i} \cap \mathfrak{i}^\perp = \{0\}$, and from this that $[\mathfrak{i}, \mathfrak{i}^\perp] = 0$.

Exercise 6.22: Let $\langle\ ,\ \rangle$ be a non-degenerate inner product on a vector space
$V$. Let $W \subseteq V$ be a subspace. Show that

$$\dim W + \dim W^\perp = \dim V.$$

(This is not as obvious as it looks. For a non-positive-definite inner product
$W$ and $W^\perp$ can have a non-trivial intersection. Consider two-dimensional
Minkowski space. If $W$ is the space of right-going, light-like, vectors then
$W = W^\perp$, but $\dim W + \dim W^\perp$ still equals two.)

Exercise 6.23: Put the two preceding exercises together to show that

$$\mathfrak{g} = \mathfrak{i} \oplus \mathfrak{i}^\perp.$$

Show that $\mathfrak{i}$ and $\mathfrak{i}^\perp$ are semi-simple in their own right as Lie algebras. We can
therefore continue to break up $\mathfrak{i}$ and $\mathfrak{i}^\perp$ until we end with $\mathfrak{g}$ decomposed into
a direct sum of simple algebras.

Compactness 

If the Killing form is negative definite, a real Lie algebra is said to be
compact, and is the Lie algebra of a compact group. With the physicist's habit
of writing $iX_i$ for the generators of the Lie algebra, a compact group has
Killing metric tensor

$$g_{ij} = \mathrm{tr}\,\{\mathrm{ad}(X_i)\,\mathrm{ad}(X_j)\} \tag{6.107}$$

that is a positive definite matrix. In a basis where $g_{ij} = \delta_{ij}$, the $\exp(\mathrm{ad}\,X)$
matrices of the adjoint representations of a compact group $G$ form a subgroup
of the orthogonal group O(N), where $N$ is the dimension of $G$.

Totally anti-symmetric structure constants 

Given a basis $iX_i$ for the Lie-algebra vector space, we define the structure
constants $f_{ij}{}^{k}$ by

$$[X_i, X_j] = i f_{ij}{}^{k} X_k. \tag{6.108}$$

In terms of the $f_{ij}{}^{k}$, the skew symmetry of $\mathrm{ad}(X_i)$, as expressed by equation
(6.103), becomes

$$\begin{aligned}
0 &= \langle \mathrm{ad}(X_k)X_i, X_j\rangle + \langle X_i, \mathrm{ad}(X_k)X_j\rangle \\
  &= \langle [X_k, X_i], X_j\rangle + \langle X_i, [X_k, X_j]\rangle \\
  &= i\,(f_{ki}{}^{l} g_{lj} + g_{il} f_{kj}{}^{l}) \\
  &= i\,(f_{kij} + f_{kji}).
\end{aligned} \tag{6.109}$$

In the last line we have used the Killing metric to "lower" the index $l$ and so
define the symbol $f_{ijk}$. Thus $f_{ijk}$ is skew symmetric under the interchange
of its second pair of indices. Since the skew symmetry of the Lie bracket
ensures that $f_{ijk}$ is skew symmetric under the interchange of the first pair of
indices, it follows that $f_{ijk}$ is skew symmetric under the interchange of any
pair of its indices.

By comparing the definition of the structure constants with

$$[X_i, X_j] = \mathrm{ad}(X_i)X_j = X_k\,[\mathrm{ad}(X_i)]^{k}{}_{j}, \tag{6.110}$$

we read off that the matrix representing $\mathrm{ad}(X_i)$ has entries

$$[\mathrm{ad}(X_i)]^{k}{}_{j} = i f_{ij}{}^{k}. \tag{6.111}$$

Consequently

$$g_{ij} = \mathrm{tr}\,\{\mathrm{ad}(X_i)\,\mathrm{ad}(X_j)\} = -f_{ik}{}^{l} f_{jl}{}^{k}. \tag{6.112}$$

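As a quick worked instance of (6.112), offered here as an illustration rather than as part of the original text, take su(2) in the physicists' convention $[X_i, X_j] = i\epsilon_{ijk}X_k$, so that $f_{ij}{}^{k} = \epsilon_{ijk}$:

```latex
% su(2): f_{ij}{}^{k} = \epsilon_{ijk}
g_{ij} = -f_{ik}{}^{l} f_{jl}{}^{k}
       = -\epsilon_{ikl}\,\epsilon_{jlk}
       = +\epsilon_{ikl}\,\epsilon_{jkl}
       = 2\,\delta_{ij}.
```

The metric is positive definite, as expected for the Lie algebra of a compact group, and since it is proportional to $\delta_{ij}$ the structure constants with all indices lowered are again totally antisymmetric.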
The quadratic Casimir 

The only "product" that is defined in the abstract Lie algebra $\mathfrak{g}$ is the Lie
bracket $[X, Y]$. Once we have found matrices forming a representation of
the Lie algebra, however, we can form the ordinary matrix product of these.
Suppose that we have a Lie algebra $\mathfrak{g}$ with basis $X_i$ and have found matrices
$\hat X_i$ with the same commutation relations as the $X_i$. Suppose further that the
algebra is semisimple and so $g^{ij}$, the inverse of the Killing metric, exists. We
can use $g^{ij}$ to construct the matrix

$$\hat C_2 = g^{ij}\hat X_i \hat X_j. \tag{6.113}$$

This matrix is called the quadratic Casimir operator, after Hendrik Casimir.
Its chief property is that it commutes with all the $\hat X_i$:

$$[\hat C_2, \hat X_i] = 0. \tag{6.114}$$

If our representation is irreducible then Schur's lemma tells us that

$$\hat C_2 = c_2 I, \tag{6.115}$$

where the number $c_2$ is referred to as the "value" of the quadratic Casimir
in that irrep.³

Exercise 6.24: Show that $[\hat C_2, \hat X_i] = 0$ is another consequence of the complete
skew symmetry of the $f_{ijk}$.

6.3.3 Roots and Weights 

We now want to study the representation theory of Lie groups. It is, in fact,
easier to study the representations of the Lie algebra, and then exponentiate
these to find the representations of the group. In other words given an
abstract Lie algebra with bracket

$$[X_i, X_j] = i f_{ij}{}^{k} X_k, \tag{6.116}$$

we seek to find all matrices $\hat X_i^J$ such that

$$[\hat X_i^J, \hat X_j^J] = i f_{ij}{}^{k} \hat X_k^J. \tag{6.117}$$

(Here, as with the representations of finite groups, we use the superscript $J$ to
distinguish one representation from another.) Then, given a representation
$\hat X_i^J$ of the Lie algebra, the matrices

$$D^J(g(\xi)) = \exp\{i\xi^i \hat X_i^J\}, \tag{6.118}$$

where $g(\xi) = \mathrm{Exp}\,\{i\xi^i X_i\}$, will form a representation of the Lie group. To
be more precise, they will form a representation of that part of the group
which is connected to the identity element. The numbers $\xi^i$ will serve as
co-ordinates for some neighbourhood of the identity. For compact groups
there will be a restriction on the range of the $\xi^i$ because there must be $\xi^i$ for
which $\exp\{i\xi^i \hat X_i^J\} = I$.

³ Mathematicians do sometimes consider formal products of Lie algebra elements $X, Y \in \mathfrak{g}$.
When they do, they equip them with the rule that $XY - YX - [X, Y] = 0$, where $XY$
and $YX$ are formal products, and $[X, Y]$ is the Lie algebra product. These formal products
are not elements of the Lie algebra, but instead live in an extended mathematical structure
called the universal enveloping algebra of $\mathfrak{g}$, and denoted by $U(\mathfrak{g})$. The quadratic Casimir
can then be considered to be an element of this larger algebra.
SU(2)

The quantum-mechanical angular momentum algebra consists of the
commutation relation

$$[J_1, J_2] = i\hbar J_3, \tag{6.119}$$

together with two similar equations related by cyclic permutations. This,
once we set $\hbar = 1$, is the Lie algebra su(2) of the group SU(2). The goal
of representation theory is to find all possible sets of matrices which have
the same commutation relations as these operators. Since the group SU(2) is
compact, we can use the group-averaging trick from section 5.2.2 to define an
inner product with respect to which these representations are unitary, and
the matrices $J_i$ hermitian.

Remember how this problem is solved in quantum mechanics courses,
where we find a representation for each spin $j = \frac12, 1, \frac32$, etc. We begin by
constructing "ladder" operators

$$J_+ = J_1 + iJ_2, \qquad J_- = J_+^\dagger = J_1 - iJ_2, \tag{6.120}$$

which are eigenvectors of $\mathrm{ad}(J_3)$:

$$\mathrm{ad}(J_3)J_\pm = [J_3, J_\pm] = \pm J_\pm. \tag{6.121}$$

From (6.121) we see that if $|j, m\rangle$ is an eigenstate of $J_3$ with eigenvalue $m$,
then $J_\pm|j, m\rangle$ is an eigenstate of $J_3$ with eigenvalue $m \pm 1$.

Now in any finite-dimensional representation there must be a highest
weight state, $|j, j\rangle$, such that $J_3|j, j\rangle = j|j, j\rangle$ for some real number $j$, and
such that $J_+|j, j\rangle = 0$. From $|j, j\rangle$ we work down by successive applications
of $J_-$ to find $|j, j-1\rangle$, $|j, j-2\rangle, \ldots$ We can find the normalization factors of
the states $|j, m\rangle \propto (J_-)^{j-m}|j, j\rangle$ by repeated use of the identities

$$\begin{aligned}
J_+J_- &= (J_1^2 + J_2^2 + J_3^2) - (J_3^2 - J_3), \\
J_-J_+ &= (J_1^2 + J_2^2 + J_3^2) - (J_3^2 + J_3).
\end{aligned} \tag{6.122}$$

The combination $J^2 = J_1^2 + J_2^2 + J_3^2$ is the quadratic Casimir of su(2), and
hence in any irrep is proportional to the identity matrix: $J^2 = c_2 I$. Because

$$\begin{aligned}
0 &= \|J_+|j, j\rangle\|^2 \\
  &= \langle j, j|J_+^\dagger J_+|j, j\rangle \\
  &= \langle j, j|J_-J_+|j, j\rangle \\
  &= \langle j, j|\left(J^2 - J_3(J_3 + 1)\right)|j, j\rangle \\
  &= [c_2 - j(j+1)]\,\langle j, j|j, j\rangle,
\end{aligned} \tag{6.123}$$

and $\langle j, j|j, j\rangle \equiv \||j, j\rangle\|^2$ is not zero, we must have $c_2 = j(j+1)$.

We now compute

$$\begin{aligned}
\|J_-|j, m\rangle\|^2 &= \langle j, m|J_-^\dagger J_-|j, m\rangle \\
  &= \langle j, m|J_+J_-|j, m\rangle \\
  &= \langle j, m|\left(J^2 - J_3(J_3 - 1)\right)|j, m\rangle \\
  &= [j(j+1) - m(m-1)]\,\langle j, m|j, m\rangle,
\end{aligned} \tag{6.124}$$

and deduce that the resulting set of normalized states $|j, m\rangle$ can be chosen
to obey

$$\begin{aligned}
J_3|j, m\rangle &= m\,|j, m\rangle, \\
J_-|j, m\rangle &= \sqrt{j(j+1) - m(m-1)}\;|j, m-1\rangle, \\
J_+|j, m\rangle &= \sqrt{j(j+1) - m(m+1)}\;|j, m+1\rangle.
\end{aligned} \tag{6.125}$$

If we take $j$ to be an integer or a half-integer, we will find that $J_-|j, -j\rangle = 0$.
In this case we are able to construct a total of $2j + 1$ states, one for each
integer-spaced $m$ in the range $-j \le m \le j$. If we select some other fractional
value for $j$, then the set of states will not terminate gracefully, and we will
find an infinity of states with $m < -j$. These will have $\|J_-|j, m\rangle\|^2 < 0$, so
the resultant representation cannot be unitary.
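The matrix elements (6.125) can be turned directly into explicit matrices. The following sketch is an illustration only: the function names and the choice of basis ordering $m = j, j-1, \ldots, -j$ are assumptions of the example. It builds $J_3$ and $J_\pm$ for a given spin and lets one confirm that $[J_+, J_-] = 2J_3$ and that $J^2 = j(j+1)I$.

```python
from fractions import Fraction
from math import sqrt

def spin_matrices(two_j):
    """Return (J3, Jplus, Jminus) for spin j = two_j/2, basis m = j, j-1, ..., -j."""
    j = Fraction(two_j, 2)
    ms = [j - k for k in range(two_j + 1)]
    dim = len(ms)
    J3 = [[float(ms[a]) if a == b else 0.0 for b in range(dim)] for a in range(dim)]
    Jp = [[0.0] * dim for _ in range(dim)]
    Jm = [[0.0] * dim for _ in range(dim)]
    for col, m in enumerate(ms):
        if col + 1 < dim:  # J_- |j,m> = sqrt(j(j+1) - m(m-1)) |j,m-1>
            Jm[col + 1][col] = sqrt(j * (j + 1) - m * (m - 1))
        if col - 1 >= 0:   # J_+ |j,m> = sqrt(j(j+1) - m(m+1)) |j,m+1>
            Jp[col - 1][col] = sqrt(j * (j + 1) - m * (m + 1))
    return J3, Jp, Jm

def matmul(P, Q):
    n = len(P)
    return [[sum(P[a][c] * Q[c][b] for c in range(n)) for b in range(n)]
            for a in range(n)]

def commutator(P, Q):
    PQ, QP = matmul(P, Q), matmul(Q, P)
    n = len(P)
    return [[PQ[a][b] - QP[a][b] for b in range(n)] for a in range(n)]

def casimir(two_j):
    """J^2 = (J+ J- + J- J+)/2 + J3^2 as a matrix."""
    J3, Jp, Jm = spin_matrices(two_j)
    pm, mp, zz = matmul(Jp, Jm), matmul(Jm, Jp), matmul(J3, J3)
    n = len(J3)
    return [[0.5 * (pm[a][b] + mp[a][b]) + zz[a][b] for b in range(n)]
            for a in range(n)]

for row in casimir(3):  # j = 3/2: a multiple of the 4-by-4 identity
    print(row)
```

Passing `two_j` rather than `j` keeps the half-integer bookkeeping exact until the square roots are taken.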



SU(3) 

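The strategy of finding ladder operators works for any semi-simple Lie
algebra. Consider, for example, su(3) = Lie(SU(3)). The matrix Lie algebra
su(3) is spanned by the Gell-Mann $\lambda$-matrices

$$\begin{aligned}
\hat\lambda_1 &= \begin{pmatrix} 0&1&0\\ 1&0&0\\ 0&0&0 \end{pmatrix}, &
\hat\lambda_2 &= \begin{pmatrix} 0&-i&0\\ i&0&0\\ 0&0&0 \end{pmatrix}, &
\hat\lambda_3 &= \begin{pmatrix} 1&0&0\\ 0&-1&0\\ 0&0&0 \end{pmatrix}, \\
\hat\lambda_4 &= \begin{pmatrix} 0&0&1\\ 0&0&0\\ 1&0&0 \end{pmatrix}, &
\hat\lambda_5 &= \begin{pmatrix} 0&0&-i\\ 0&0&0\\ i&0&0 \end{pmatrix}, &
\hat\lambda_6 &= \begin{pmatrix} 0&0&0\\ 0&0&1\\ 0&1&0 \end{pmatrix}, \\
\hat\lambda_7 &= \begin{pmatrix} 0&0&0\\ 0&0&-i\\ 0&i&0 \end{pmatrix}, &
\hat\lambda_8 &= \frac{1}{\sqrt3}\begin{pmatrix} 1&0&0\\ 0&1&0\\ 0&0&-2 \end{pmatrix},
\end{aligned} \tag{6.126}$$

which form a basis for the real vector space of 3-by-3 traceless, hermitian
matrices. They have been chosen and normalized so that

$$\mathrm{tr}\,(\hat\lambda_i\hat\lambda_j) = 2\delta_{ij}, \tag{6.127}$$

by analogy with the properties of the Pauli matrices. Notice that $\hat\lambda_3$ and $\hat\lambda_8$
commute with each other, and that this will be true in any representation.
The matrices

$$\begin{aligned}
t_\pm &= \tfrac12(\hat\lambda_1 \pm i\hat\lambda_2), \\
v_\pm &= \tfrac12(\hat\lambda_4 \pm i\hat\lambda_5), \\
u_\pm &= \tfrac12(\hat\lambda_6 \pm i\hat\lambda_7)
\end{aligned} \tag{6.128}$$

have unit entries, rather like the step up and step down matrices $\sigma_\pm =
\frac12(\sigma_1 \pm i\sigma_2)$.

Let us define $\lambda_i$ to be abstract operators with the same commutation
relations as $\hat\lambda_i$, and define

$$\begin{aligned}
T_\pm &= \tfrac12(\lambda_1 \pm i\lambda_2), \\
V_\pm &= \tfrac12(\lambda_4 \pm i\lambda_5), \\
U_\pm &= \tfrac12(\lambda_6 \pm i\lambda_7).
\end{aligned} \tag{6.129}$$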

These are simultaneous eigenvectors of the commuting pair of operators
$\mathrm{ad}(\lambda_3)$ and $\mathrm{ad}(\lambda_8)$:

$$\begin{aligned}
\mathrm{ad}(\lambda_3)T_\pm &= [\lambda_3, T_\pm] = \pm 2T_\pm, \\
\mathrm{ad}(\lambda_3)V_\pm &= [\lambda_3, V_\pm] = \pm V_\pm, \\
\mathrm{ad}(\lambda_3)U_\pm &= [\lambda_3, U_\pm] = \mp U_\pm, \\
\mathrm{ad}(\lambda_8)T_\pm &= [\lambda_8, T_\pm] = 0, \\
\mathrm{ad}(\lambda_8)V_\pm &= [\lambda_8, V_\pm] = \pm\sqrt3\,V_\pm, \\
\mathrm{ad}(\lambda_8)U_\pm &= [\lambda_8, U_\pm] = \pm\sqrt3\,U_\pm.
\end{aligned} \tag{6.130}$$

Thus, in any representation, the $T_\pm$, $U_\pm$, $V_\pm$ act as ladder operators,
changing the simultaneous eigenvalues of the commuting pair $\lambda_3$, $\lambda_8$. Their
eigenvalues, $\lambda_3$, $\lambda_8$, are called the weights, and there will be a set of such weights
for each possible representation. By using the ladder operators one can go
from any weight in a representation to any other, but you cannot get outside
this set. The amounts by which the ladder operators change the weights are
called the roots or root vectors, and the root diagram characterizes the Lie
algebra.
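The root vectors can be read off numerically from the defining representation. The sketch below is illustrative, with hypothetical helper names: it uses the standard Gell-Mann matrices, forms the raising matrices $\frac12(\lambda_a + i\lambda_b)$, and extracts the number $c$ in $[H, E] = cE$ for $H = \lambda_3$ and $H = \lambda_8$.

```python
from math import sqrt

def mat(rows):
    return [[complex(x) for x in row] for row in rows]

def matmul(P, Q):
    return [[sum(P[a][c] * Q[c][b] for c in range(3)) for b in range(3)]
            for a in range(3)]

def comm(P, Q):
    PQ, QP = matmul(P, Q), matmul(Q, P)
    return [[PQ[a][b] - QP[a][b] for b in range(3)] for a in range(3)]

def half_sum(P, Q, phase):
    """Return (P + phase * Q) / 2."""
    return [[(P[a][b] + phase * Q[a][b]) / 2 for b in range(3)] for a in range(3)]

# Standard Gell-Mann matrices in the defining representation.
L1 = mat([[0, 1, 0], [1, 0, 0], [0, 0, 0]])
L2 = mat([[0, -1j, 0], [1j, 0, 0], [0, 0, 0]])
L3 = mat([[1, 0, 0], [0, -1, 0], [0, 0, 0]])
L4 = mat([[0, 0, 1], [0, 0, 0], [1, 0, 0]])
L5 = mat([[0, 0, -1j], [0, 0, 0], [1j, 0, 0]])
L6 = mat([[0, 0, 0], [0, 0, 1], [0, 1, 0]])
L7 = mat([[0, 0, 0], [0, 0, -1j], [0, 1j, 0]])
L8 = mat([[1 / sqrt(3), 0, 0], [0, 1 / sqrt(3), 0], [0, 0, -2 / sqrt(3)]])

def ad_eigenvalue(H, E, tol=1e-12):
    """Return the real number c with [H, E] = c E; raise if E is not an eigenvector."""
    C = comm(H, E)
    a, b = next((a, b) for a in range(3) for b in range(3) if abs(E[a][b]) > tol)
    c = C[a][b] / E[a][b]
    if any(abs(C[x][y] - c * E[x][y]) > tol for x in range(3) for y in range(3)):
        raise ValueError("E is not an eigenvector of ad(H)")
    return c.real

def su3_roots():
    raising = {
        "T+": half_sum(L1, L2, 1j),
        "V+": half_sum(L4, L5, 1j),
        "U+": half_sum(L6, L7, 1j),
    }
    return {name: (ad_eigenvalue(L3, E), ad_eigenvalue(L8, E))
            for name, E in raising.items()}

print(su3_roots())
```

The three pairs returned are the positive roots; replacing the phase `1j` by `-1j` gives the lowering matrices and the negatives of these roots.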

Figure 6.2: The root vectors of su(3).



In a finite-dimensional representation there must be a highest weight state
$|\lambda_3, \lambda_8\rangle$ that is killed by all three of $U_+$, $T_+$ and $V_+$. We can then obtain
all other states in the representation by repeatedly acting on the highest
weight state with $U_-$, $T_-$ or $V_-$ and their products. Since there is usually
more than one route by which we can step down from the highest weight
to another weight, the weight spaces may be degenerate, i.e. there may be
more than one linearly independent state with the same eigenvalues of $\lambda_3$
and $\lambda_8$. Exactly what states are obtained, and with what multiplicity, is not
immediately obvious. We will therefore restrict ourselves to describing the
outcome of this procedure without giving proofs.

What we find is that the weights in a finite-dimensional representation of
su(3) form a hexagonally symmetric "crystal" lying on a triangular lattice,
and the representations may be labelled by pairs of integers (zero allowed)
$p$, $q$ which give the length of the sides of the crystal. These representations
have dimension $d = \frac12(p+1)(q+1)(p+q+2)$.
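The dimension formula is easily tabulated. A minimal sketch follows; the function name `dim_su3` is an assumption of this example.

```python
def dim_su3(p, q):
    """Dimension d = (p+1)(q+1)(p+q+2)/2 of the su(3) irrep labelled by (p, q)."""
    if p < 0 or q < 0:
        raise ValueError("p and q must be non-negative integers")
    # The product is always even, so integer division is exact.
    return (p + 1) * (q + 1) * (p + q + 2) // 2

for p, q in [(0, 0), (1, 0), (0, 1), (1, 1), (3, 0), (3, 1), (2, 2)]:
    print((p, q), dim_su3(p, q))
```

The labels $(1,0)$ and $(0,1)$ give the two three-dimensional representations, $(1,1)$ the eight-dimensional adjoint, and $(3,1)$ a 24-dimensional representation.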
Figure 6.3: The weight diagram of the 24 dimensional irrep with $p = 3$,
$q = 1$. The highest weight is shaded.



Figure 6.3 shows the set of weights occurring in the representation of SU(3)
with $p = 3$ and $q = 1$. Each circle represents a state, whose weight $(\lambda_3, \lambda_8)$
may be read off from the displayed axes. A double circle indicates that there
are two linearly independent vectors with the same weight. A count confirms
that the number of independent weights, and hence the dimension of the
representation, is 24. For SU(3) representations the degeneracy (i.e. the
number of states with a given weight) increases by unity at each "layer"
until we reach a triangular inner core, all of whose weights have the same
degeneracy.

In particle physics applications representations are often labelled by their
dimension. The defining representation of SU(3) and its complex conjugate
are denoted by 3 and $\bar 3$, while the weight diagrams of the eight-dimensional
adjoint representation and the 10 have the shapes shown in figure 6.5.

Figure 6.4: The weight diagrams of the irreps with $p = 1$, $q = 0$, and $p = 0$,
$q = 1$, also known, respectively, as the 3 and the $\bar 3$.

Figure 6.5: The irreps 8 (the adjoint) and 10.



Cartan algebras: roots and co-roots 

For a general simple Lie algebra we may play the same game. We first find a
maximal linearly independent set of commuting generators, $h_i$. The $h_i$ form
a basis for the Cartan algebra, $\mathfrak{h}$, whose dimension is the rank of the Lie
algebra. We next find ladder operators by diagonalizing the "ad" action of
the $h_i$ on the rest of the algebra:

$$\mathrm{ad}(h_i)e_\alpha = [h_i, e_\alpha] = \alpha_i e_\alpha. \tag{6.131}$$

The simultaneous eigenvectors $e_\alpha$ are the ladder operators that change the
eigenvalues of the $h_i$. The corresponding eigenvalues $\alpha$, thought of as vectors
with components $\alpha_i$, are the roots, or root vectors. The roots are therefore
the weights of the adjoint representation. It is possible to put factors of "i"
in the appropriate places so that the $\alpha_i$ are real, and we will assume that this
has been done. For example in su(3) we have already seen that $\alpha_T = (2, 0)$,
$\alpha_V = (1, \sqrt3)$, $\alpha_U = (-1, \sqrt3)$.

Here are the basic properties and ideas that emerge from this process: 

i) Since $\alpha_i\langle e_\alpha, h_j\rangle = \langle \mathrm{ad}(h_i)e_\alpha, h_j\rangle = -\langle e_\alpha, [h_i, h_j]\rangle = 0$ we see that
$\langle h_i, e_\alpha\rangle = 0$.

ii) Similarly, we see that $(\alpha_i + \beta_i)\langle e_\alpha, e_\beta\rangle = 0$, so the $e_\alpha$ are orthogonal to
one another unless $\alpha + \beta = 0$. Since our Lie algebra is semisimple, and
consequently the Killing form non-degenerate, we deduce that if $\alpha$ is a
root, so is $-\alpha$.

iii) Since the Killing form is non-degenerate, yet the $h_i$ are orthogonal to
all the $e_\alpha$, it must also be non-degenerate when restricted to the Cartan
algebra. Thus the metric tensor, $g_{ij} = \langle h_i, h_j\rangle$, must be invertible with
inverse $g^{ij}$. We will use the notation $\alpha\cdot\beta$ to represent $\alpha_i\beta_j g^{ij}$.

iv) If $\alpha$, $\beta$ are roots, then the Jacobi identity shows that

$$[h_i, [e_\alpha, e_\beta]] = (\alpha_i + \beta_i)[e_\alpha, e_\beta],$$

so if $[e_\alpha, e_\beta]$ is non-zero then $\alpha + \beta$ is also a root, and $[e_\alpha, e_\beta] \propto e_{\alpha+\beta}$.

v) It follows from iv) that $[e_\alpha, e_{-\alpha}]$ commutes with all the $h_i$, and since $\mathfrak{h}$
was assumed maximal, it must either be zero or a linear combination
of the $h_i$. A short calculation shows that

$$\langle h_i, [e_\alpha, e_{-\alpha}]\rangle = \alpha_i\langle e_\alpha, e_{-\alpha}\rangle,$$

and, since $\langle e_\alpha, e_{-\alpha}\rangle$ does not vanish, $[e_\alpha, e_{-\alpha}]$ is non-zero. Thus

$$[e_\alpha, e_{-\alpha}] \propto \frac{2\alpha^i}{\alpha^2}h_i \equiv h_\alpha,$$

where $\alpha^i = g^{ij}\alpha_j$, and $h_\alpha$ obeys

$$[h_\alpha, e_{\pm\alpha}] = \pm 2e_{\pm\alpha}.$$

The $h_\alpha$ are called the co-roots.

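vi) The importance of the co-roots stems from the observation that the
triad $h_\alpha$, $e_{\pm\alpha}$ obey the same commutation relations as $\sigma_3$ and $\sigma_\pm$, and
so form an su(2) subalgebra of $\mathfrak{g}$. In particular $h_\alpha$ (being the analogue
of $2J_3$) has only integer eigenvalues. For example in su(3)

$$\begin{aligned}
{}[T_+, T_-] &= h_T = \lambda_3, \\
[V_+, V_-] &= h_V = \tfrac12\lambda_3 + \tfrac{\sqrt3}{2}\lambda_8, \\
[U_+, U_-] &= h_U = -\tfrac12\lambda_3 + \tfrac{\sqrt3}{2}\lambda_8,
\end{aligned}$$

and in the defining representation

$$h_T = \begin{pmatrix} 1&0&0\\ 0&-1&0\\ 0&0&0 \end{pmatrix}, \qquad
h_V = \begin{pmatrix} 1&0&0\\ 0&0&0\\ 0&0&-1 \end{pmatrix}, \qquad
h_U = \begin{pmatrix} 0&0&0\\ 0&1&0\\ 0&0&-1 \end{pmatrix}$$

have integer eigenvalues ($\pm 1$ and $0$).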

vii) Since

$$\mathrm{ad}(h_\alpha)e_\beta = [h_\alpha, e_\beta] = \frac{2\alpha\cdot\beta}{\alpha^2}\,e_\beta,$$

we conclude that $2\alpha\cdot\beta/\alpha^2$ must be an integer for any pair of roots $\alpha$,
$\beta$.

viii) Finally, there can only be one $e_\alpha$ for each root $\alpha$. If not, and there
were an independent $e'_\alpha$, we could take linear combinations so that $e_{-\alpha}$
and $e'_\alpha$ are Killing orthogonal, and hence $[e_{-\alpha}, e'_\alpha] = \alpha^i h_i\langle e_{-\alpha}, e'_\alpha\rangle = 0$.
Thus $\mathrm{ad}(e_{-\alpha})e'_\alpha = 0$, and $e'_\alpha$ is killed by the step-down operator. It
would therefore be the lowest weight in some su(2) representation. At
the same time, however, $\mathrm{ad}(h_\alpha)e'_\alpha = 2e'_\alpha$, and we know that the lowest
weight in any spin $J$ representation cannot have positive eigenvalue.

The condition that

$$\frac{2\alpha\cdot\beta}{\alpha^2} \in \mathbb{Z}$$

for any pair of roots tightly constrains the possible root systems, and is the
key to Cartan and Killing's classification of the semisimple Lie algebras. For
example the angle $\theta$ between any pair of roots obeys $\cos^2\theta = n/4$, with $n$ an
integer, so $\theta$ can take only the values 0°, 30°, 45°, 60°, 90°, 120°, 135°, 150°, or 180°.
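The list of angles follows because $4\cos^2\theta = (2\alpha\cdot\beta/\alpha^2)(2\alpha\cdot\beta/\beta^2)$ is a product of two integers lying between 0 and 4. A short sketch (the function name is an assumption of this example) enumerates the possibilities:

```python
from math import acos, degrees, sqrt

def allowed_root_angles():
    """Angles in degrees with cos^2(theta) = n/4 for n = 0, 1, 2, 3, 4."""
    angles = set()
    for n in range(5):
        for sign in (+1, -1):
            theta = degrees(acos(sign * sqrt(n) / 2))
            angles.add(round(theta, 6))  # rounding removes floating-point noise
    return sorted(angles)

print(allowed_root_angles())
```

Both signs of $\cos\theta$ are included because only $\cos^2\theta$ is constrained.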
These constraints lead to a complete classification of possible root systems
into the infinite families

    $A_n$,  $n = 1, 2, \ldots$:   $\mathfrak{sl}(n+1, \mathbb{C})$,
    $B_n$,  $n = 2, 3, \ldots$:   $\mathfrak{so}(2n+1, \mathbb{C})$,
    $C_n$,  $n = 3, 4, \ldots$:   $\mathfrak{sp}(2n, \mathbb{C})$,
    $D_n$,  $n = 4, 5, \ldots$:   $\mathfrak{so}(2n, \mathbb{C})$,

together with the root systems $G_2$, $F_4$, $E_6$, $E_7$, and $E_8$ of the exceptional
algebras. The latter do not correspond to any of the classical matrix groups.
For example $G_2$ is the root system of $\mathfrak{g}_2$, the Lie algebra of the group $G_2$ of
automorphisms of the octonions. This group is also the subgroup of SL(7)
preserving the general totally antisymmetric trilinear form.

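The restrictions on the values of $n$ are to avoid repeats arising from "accidental"
isomorphisms. If we allow $n = 1, 2, 3$ in each series, then $C_1 = B_1 = A_1$.
This corresponds to $\mathfrak{sp}(2, \mathbb{C}) \cong \mathfrak{so}(3, \mathbb{C}) \cong \mathfrak{sl}(2, \mathbb{C})$. Similarly $D_2 = A_1 + A_1$,
corresponding to the isomorphism $\mathrm{SO}(4) \cong \mathrm{SU}(2) \times \mathrm{SU}(2)/\mathbb{Z}_2$, while $C_2 = B_2$
implies that, locally, the compact $\mathrm{Sp}(2) \cong \mathrm{SO}(5)$. Finally $D_3 = A_3$ implies
that $\mathrm{SU}(4)/\mathbb{Z}_2 \cong \mathrm{SO}(6)$.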
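A necessary (though far from sufficient) check on these accidental isomorphisms is that the dimensions of the algebras agree. The sketch below uses the standard dimension formulas for the classical algebras; the function names are assumptions of this example.

```python
def dim_A(n):
    return n * (n + 2)        # sl(n+1): (n+1)^2 - 1

def dim_B(n):
    return n * (2 * n + 1)    # so(2n+1): (2n+1)(2n)/2

def dim_C(n):
    return n * (2 * n + 1)    # sp(2n)

def dim_D(n):
    return n * (2 * n - 1)    # so(2n): (2n)(2n-1)/2

print(dim_A(1), dim_B(1), dim_C(1))   # C_1 = B_1 = A_1
print(dim_D(2), 2 * dim_A(1))         # D_2 = A_1 + A_1
print(dim_B(2), dim_C(2))             # C_2 = B_2
print(dim_D(3), dim_A(3))             # D_3 = A_3
```

Equal dimensions do not prove an isomorphism (for instance $B_n$ and $C_n$ have the same dimension for every $n$ but differ for $n \ge 3$), so this is only a consistency check.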

6.3.4 Product Representations 

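Given two representations $\Lambda_i^{(1)}$ and $\Lambda_i^{(2)}$ of $\mathfrak{g}$, we can form a new
representation that exponentiates to the tensor product of the corresponding
representations of the group $G$. Motivated by the result of exercise 5.13:

$$\exp(A \otimes I_n + I_m \otimes B) = \exp(A) \otimes \exp(B), \tag{6.132}$$

we set

$$\Lambda_i^{(1\otimes2)} = \Lambda_i^{(1)} \otimes I^{(2)} + I^{(1)} \otimes \Lambda_i^{(2)}. \tag{6.133}$$

Then

$$\begin{aligned}
{}[\Lambda_i^{(1\otimes2)}, \Lambda_j^{(1\otimes2)}]
 &= [(\Lambda_i^{(1)} \otimes I^{(2)} + I^{(1)} \otimes \Lambda_i^{(2)}),\, (\Lambda_j^{(1)} \otimes I^{(2)} + I^{(1)} \otimes \Lambda_j^{(2)})] \\
 &= [\Lambda_i^{(1)}, \Lambda_j^{(1)}] \otimes I^{(2)} + [\Lambda_i^{(1)}, I^{(1)}] \otimes \Lambda_j^{(2)} \\
 &\quad + \Lambda_j^{(1)} \otimes [\Lambda_i^{(2)}, I^{(2)}] + I^{(1)} \otimes [\Lambda_i^{(2)}, \Lambda_j^{(2)}],
\end{aligned} \tag{6.134}$$

where the two middle terms vanish because the identity commutes with everything. This
shows that the $\Lambda_i^{(1\otimes2)}$ also obey the Lie algebra.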

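This process of combining representations is analogous to the addition
of angular momentum in quantum mechanics. Perhaps more precisely, the
addition of angular momentum is an example of this general construction.
If representation $\Lambda^{(1)}$ has weights $m_i^{(1)}$, i.e. $h_i^{(1)}|m^{(1)}\rangle = m_i^{(1)}|m^{(1)}\rangle$, and $\Lambda^{(2)}$
has weights $m_i^{(2)}$, then, writing $|m^{(1)}, m^{(2)}\rangle$ for $|m^{(1)}\rangle \otimes |m^{(2)}\rangle$, we have

$$\begin{aligned}
h_i^{(1\otimes2)}|m^{(1)}, m^{(2)}\rangle &= (h_i^{(1)} \otimes 1 + 1 \otimes h_i^{(2)})\,|m^{(1)}, m^{(2)}\rangle \\
 &= (m_i^{(1)} + m_i^{(2)})\,|m^{(1)}, m^{(2)}\rangle,
\end{aligned} \tag{6.135}$$

so the weights appearing in the representation $\Lambda^{(1\otimes2)}$ are $m_i^{(1)} + m_i^{(2)}$.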

The new representation is usually decomposable. We are familiar with
this decomposition for angular momentum where, if $j > j'$,

$$j \otimes j' = (j + j') \oplus (j + j' - 1) \oplus \cdots \oplus (j - j'). \tag{6.136}$$

This can be understood from adding weights. For example consider adding
the weights of $j = 1/2$, which are $m = \pm 1/2$, to those of $j = 1$, which are
$m = -1, 0, 1$. We get $m = -3/2$, $-1/2$ (twice), $+1/2$ (twice) and $m = 3/2$.
These decompose as shown in figure 6.6.

Figure 6.6: The weights for $1/2 \otimes 1 = 3/2 \oplus 1/2$.
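The weight-adding procedure just described is mechanical enough to code. The sketch below is illustrative, with hypothetical function names: it forms all sums $m^{(1)} + m^{(2)}$ and then repeatedly strips off the su(2) irrep whose highest weight is the largest $m$ remaining.

```python
from collections import Counter
from fractions import Fraction

def weights(j):
    """The weights m = j, j-1, ..., -j of the spin-j representation."""
    j = Fraction(j)
    return [j - k for k in range(int(2 * j) + 1)]

def product_weights(j1, j2):
    """Multiset of weights m1 + m2 in the product representation."""
    return Counter(m1 + m2 for m1 in weights(j1) for m2 in weights(j2))

def peel_irreps(weight_counts):
    """Decompose a weight multiset into spins by removing the highest weight's irrep."""
    remaining = Counter(weight_counts)
    spins = []
    while sum(remaining.values()) > 0:
        j = max(m for m, count in remaining.items() if count > 0)
        spins.append(j)
        for m in weights(j):
            remaining[m] -= 1
            if remaining[m] < 0:
                raise ValueError("not the weight multiset of an su(2) representation")
    return spins

print(peel_irreps(product_weights(Fraction(1, 2), 1)))  # spins 3/2 and 1/2
```

Using `Fraction` keeps the half-integer weights exact, so equal weights are counted together reliably.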

The rules for decomposing products in other groups are more
complicated than for SU(2), but can be obtained from weight diagrams in the same
manner. In SU(3), we have, for example

$$\begin{aligned}
3 \otimes \bar 3 &= 1 \oplus 8, \\
3 \otimes 8 &= 3 \oplus \bar 6 \oplus 15, \\
8 \otimes 8 &= 1 \oplus 8 \oplus 8 \oplus 10 \oplus \overline{10} \oplus 27.
\end{aligned} \tag{6.137}$$

To illustrate the first of these we show, in figure 6.7, the addition of the
weights in the $\bar 3$ to each of the weights in the 3.

Figure 6.7: Adding the weights of 3 and $\bar 3$.

The resultant weights decompose (uniquely) into the weight diagrams for the
8 together with a singlet.
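A simple consistency check on decompositions such as (6.137), added here as an illustration, is that the dimensions on the two sides agree:

```latex
3 \times 3 = 9  = 1 + 8, \qquad
3 \times 8 = 24 = 3 + 6 + 15, \qquad
8 \times 8 = 64 = 1 + 8 + 8 + 10 + 10 + 27 .
```

Matching dimensions is necessary but not sufficient; the weight diagrams are what fix which irreps actually appear.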



6.3.5 Sub-algebras and branching rules 

As with finite groups, a representation that is irreducible under the full Lie 
group or algebra will in general become reducible when restricted to a sub- 
group or sub-algebra. The pattern of the decomposition is again called a 
branching rule. Here we provide some examples to illustrate the ideas. 

The three operators $V_\pm$ and $h_V = \frac12\lambda_3 + \frac{\sqrt3}{2}\lambda_8$ of su(3) form a Lie
sub-algebra that is isomorphic to su(2) under the map that takes them to $\sigma_\pm$
and $\sigma_3$ respectively. When restricted to this sub-algebra, the 8 dimensional
representation of su(3) becomes reducible, decomposing as

$$8 = 3 \oplus 2 \oplus 2 \oplus 1, \tag{6.138}$$

where the 3, 2 and 1 are the $j = 1$, $\frac12$ and $0$ representations of su(2).
We can visualize this decomposition coming about by first projecting the
$(\lambda_3, \lambda_8)$ weights to the "m" of the $|j, m\rangle$ labelling of su(2) as

$$m = \frac14(\lambda_3 + \sqrt3\,\lambda_8), \tag{6.139}$$

and then stripping off the su(2) irreps as we did when decomposing product
representations.
This branching pattern occurs in the strong interactions where the mass
of the strange quark $s$ being much larger than that of the light quarks $u$ and
$d$ causes the octet of pseudo-scalar mesons, which would all have the same
mass if SU(3) flavour symmetry was exact, to decompose into the triplet of
pions $\pi^+$, $\pi^0$ and $\pi^-$, the pair $K^+$ and $K^0$, their antiparticles $K^-$ and $\bar K^0$,
and the singlet $\eta$.

There are obviously other su(2) sub-algebras consisting of $\{T_\pm, h_T\}$ and
$\{U_\pm, h_U\}$, each giving rise to similar decompositions. These sub-algebras,
and a continuous infinity of related ones, are obtained from the $\{V_\pm, h_V\}$
algebra by conjugation by elements of SU(3).

Another, unrelated, su(2) sub-algebra consists of

$$\begin{aligned}
\sigma_+ &\simeq \sqrt2\,(U_+ + T_+), \\
\sigma_- &\simeq \sqrt2\,(U_- + T_-), \\
\sigma_3 &\simeq 2h_V = (\lambda_3 + \sqrt3\,\lambda_8).
\end{aligned} \tag{6.140}$$

The factor of two between the assignment $\sigma_3 \simeq h_V$ of our previous example
and the present assignment $\sigma_3 \simeq 2h_V$ has a non-trivial effect on the branching
rules. Under restriction to this new subalgebra, the 8 of su(3) decomposes
as

$$8 = 5 \oplus 3, \tag{6.141}$$

where the 5 and 3 are the $j = 2$ and $j = 1$ representations of su(2).

Figure 6.9: The projection and decomposition for $8 = 5 \oplus 3$.

A clue to the origin and significance of this sub-algebra is found by noting that the
3 and $\bar 3$ representations of su(3) both remain irreducible, but project to the
same $j = 1$ representation of su(2). Interpreting this $j = 1$ representation
as the defining vector representation of so(3) suggests (correctly) that our
new su(2) sub-algebra is the Lie algebra of the SO(3) subgroup of SU(3)
consisting of SU(3) matrices with real entries.
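Both branching rules can be reproduced by projecting weights and peeling off su(2) irreps. The following sketch is an illustration with hypothetical names. To keep the arithmetic exact it stores each su(3) weight as a pair $(a, b)$ with $\lambda_3 = a$ and $\lambda_8 = \sqrt3\,b$, so that $h_V = a/2 + 3b/2$; the su(2) label is $m = h_V/2$ for the embedding $\sigma_3 \simeq h_V$ and $m = h_V$ for the embedding $\sigma_3 \simeq 2h_V$.

```python
from collections import Counter
from fractions import Fraction

# Weights of the 8: the six roots plus two zero weights, as (a, b) with
# lambda_3 = a and lambda_8 = sqrt(3) * b.
ADJOINT_WEIGHTS = [(2, 0), (-2, 0), (1, 1), (-1, -1), (-1, 1), (1, -1), (0, 0), (0, 0)]

# Weights of the 3, read off from the diagonals of lambda_3 and lambda_8.
TRIPLET_WEIGHTS = [(1, Fraction(1, 3)), (-1, Fraction(1, 3)), (0, Fraction(-2, 3))]

def h_V(a, b):
    """Eigenvalue of h_V = lambda_3/2 + (sqrt(3)/2) lambda_8 on the weight (a, b)."""
    return Fraction(a) / 2 + 3 * Fraction(b) / 2

def branch(su3_weights, scale):
    """Project to m = scale * h_V and return the su(2) irrep dimensions 2j+1."""
    remaining = Counter(scale * h_V(a, b) for a, b in su3_weights)
    dims = []
    while sum(remaining.values()) > 0:
        j = max(m for m, count in remaining.items() if count > 0)
        dims.append(int(2 * j + 1))
        m = j
        while m >= -j:  # remove one copy of each weight j, j-1, ..., -j
            remaining[m] -= 1
            if remaining[m] < 0:
                raise ValueError("weights do not decompose into su(2) irreps")
            m -= 1
    return dims

print(branch(ADJOINT_WEIGHTS, Fraction(1, 2)))  # embedding with sigma_3 ~ h_V
print(branch(ADJOINT_WEIGHTS, 1))               # embedding with sigma_3 ~ 2 h_V
print(branch(TRIPLET_WEIGHTS, 1))               # the 3 under the second embedding
```

The only difference between the two embeddings in this sketch is the factor `scale`, which is the factor of two discussed above.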

6.4 Further Exercises and Problems 

Exercise 6.25: Campbell-Baker-Hausdorff Formulae. Here are some useful
formulae for working with exponentials of matrices that do not commute with
each other.

a) Let $X$ and $Y$ be matrices. Show that

$$e^{tX} Y e^{-tX} = Y + t[X, Y] + \frac12 t^2[X, [X, Y]] + \cdots,$$

the terms on the right being the series expansion of $\exp[\mathrm{ad}(tX)]Y$.

b) Let $X$ and $\delta X$ be matrices. Show that

$$\begin{aligned}
e^{-X}e^{X+\delta X} &= 1 + \int_0^1 e^{-tX}\,\delta X\, e^{tX}\,dt + O\left[(\delta X)^2\right] \\
 &= 1 + \delta X - \frac12[X, \delta X] + \frac{1}{3!}[X, [X, \delta X]] + \cdots + O\left[(\delta X)^2\right] \\
 &= 1 + \left(\frac{1 - e^{-\mathrm{ad}(X)}}{\mathrm{ad}(X)}\right)\delta X + O\left[(\delta X)^2\right].
\end{aligned} \tag{6.142}$$

c) By expanding out the exponentials, show that

$$e^X e^Y = e^{X + Y + \frac12[X, Y] + \text{higher}},$$

where "higher" means terms higher order in $X$, $Y$. The next two terms
are, in fact, $\frac{1}{12}[X, [X, Y]] + \frac{1}{12}[Y, [Y, X]]$. You will find the general formula
in part d).

d) By using the formula from part b), show that $e^X e^Y$ can be written
as $e^Z$, where

$$Z = X + \int_0^1 g\left(e^{\mathrm{ad}(X)}e^{\mathrm{ad}(tY)}\right) Y\,dt.$$

Here

$$g(z) \equiv \frac{\ln z}{1 - 1/z}$$

has a power series expansion

$$g(z) = 1 + \frac12(z-1) - \frac16(z-1)^2 + \frac1{12}(z-1)^3 + \cdots,$$

which is convergent for $|z - 1| < 1$. Show that $g(e^{\mathrm{ad}(X)}e^{\mathrm{ad}(tY)})$ can be
expanded as a double power series in $\mathrm{ad}(X)$ and $\mathrm{ad}(tY)$, provided $X$ and
$Y$ are small enough. This $\mathrm{ad}(X)$, $\mathrm{ad}(tY)$ expansion allows us to evaluate
the product of two matrix exponentials as a third matrix exponential
provided we know their commutator algebra.

Exercise 6.26: SU(2) Disentangling theorems: Almost any 2×2 matrix can
be factored (Gaussian decomposition) as

$$\begin{pmatrix} a & b\\ c & d \end{pmatrix} =
\begin{pmatrix} 1 & \alpha\\ 0 & 1 \end{pmatrix}
\begin{pmatrix} \lambda & 0\\ 0 & \mu \end{pmatrix}
\begin{pmatrix} 1 & 0\\ \beta & 1 \end{pmatrix}.$$

Use this trick to work the following problems:

a) Show that

$$\exp\left\{\frac{\theta}{2}\left(e^{i\phi}\sigma_+ - e^{-i\phi}\sigma_-\right)\right\} = \exp(\alpha\sigma_+)\exp(\lambda\sigma_3)\exp(\beta\sigma_-),$$

where $\sigma_\pm = (\sigma_1 \pm i\sigma_2)/2$, and

$$\begin{aligned}
\alpha &= e^{i\phi}\tan\theta/2, \\
\lambda &= -\ln\cos\theta/2, \\
\beta &= -e^{-i\phi}\tan\theta/2.
\end{aligned}$$

b) Use the fact that the spin-$\frac12$ representation of SU(2) is faithful, to show
that

$$\exp\left\{\frac{\theta}{2}\left(e^{i\phi}J_+ - e^{-i\phi}J_-\right)\right\} = \exp(\alpha J_+)\exp(2\lambda J_3)\exp(\beta J_-),$$

where $J_\pm = J_1 \pm iJ_2$. Take care, the reasoning here is subtle! Notice
that the series expansion of exponentials of $\sigma_\pm$ truncates after the second
term, but the same is not true of the expansion of exponentials of the $J_\pm$.
You need to explain why the formula continues to hold in the absence of
this truncation.

Exercise 6.27: Invariant tensors for SU(3). Let $\lambda_i$ be the Gell-Mann lambda
matrices. The totally antisymmetric structure constants, $f_{ijk}$, and a set of
totally symmetric constants $d_{ijk}$ are defined, up to normalization, by

$$f_{ijk} \propto \mathrm{tr}\,(\lambda_i[\lambda_j, \lambda_k]), \qquad d_{ijk} \propto \mathrm{tr}\,(\lambda_i\{\lambda_j, \lambda_k\}).$$

Let $D^8_{ij}(g)$ be the matrices representing SU(3) in "8", the eight-dimensional
adjoint representation.

a) Show that

$$\begin{aligned}
f_{ijk} &= D^8_{il}(g)\,D^8_{jm}(g)\,D^8_{kn}(g)\,f_{lmn}, \\
d_{ijk} &= D^8_{il}(g)\,D^8_{jm}(g)\,D^8_{kn}(g)\,d_{lmn},
\end{aligned}$$

and so $f_{ijk}$ and $d_{ijk}$ are invariant tensors in the same sense that $\delta_{ij}$ and
$\epsilon_{i_1\ldots i_n}$ are invariant tensors for SO(n).

b) Let $w_i = f_{ijk}u_jv_k$. Show that if $u_i \to D^8_{ij}(g)u_j$ and $v_i \to D^8_{ij}(g)v_j$, then
$w_i \to D^8_{ij}(g)w_j$. Similarly for $w_i = d_{ijk}u_jv_k$. (Hint: show first that
the $D^8$ matrices are real and orthogonal.) Deduce that $f_{ijk}$ and $d_{ijk}$ are
Clebsch-Gordan coefficients for the $8 \oplus 8$ part of the decomposition

$$8 \otimes 8 = 1 \oplus 8 \oplus 8 \oplus 10 \oplus \overline{10} \oplus 27.$$

c) Similarly show that $\delta_{\alpha\beta}$ and the lambda matrices $(\lambda_i)_{\alpha\beta}$ can be regarded
as Clebsch-Gordan coefficients for the decomposition

$$\bar 3 \otimes 3 = 1 \oplus 8.$$

d) Use the graphical method of plotting weights and peeling off irreps to
obtain the tensor product decomposition in part b).
Chapter 7

The Geometry of Fibre Bundles 



In earlier chapters we have used the language of bundles and connections, but 
in a relatively casual manner. We deferred proper mathematical definitions 
until now, because, for the applications we meet in physics, it helps to first 
have acquired an understanding of the geometry of Lie groups. 



7.1 Fibre Bundles 

We begin with a formal definition of a bundle and then illustrate the defini- 
tion with examples from quantum mechanics. These allow us to appreciate 
the physics that the definition is designed to capture. 



7.1.1 Definitions 

A smooth bundle is a triple $(E, \pi, M)$ where $E$ and $M$ are manifolds, and
$\pi : E \to M$ is a smooth map. The manifold $E$ is called the total space, $M$
is the base space and $\pi$ the projection map. The inverse image $\pi^{-1}(x)$ of a
point in $M$ (i.e. the set of points in $E$ that map to $x$ in $M$), is the fibre over
$x$.

We usually require that all fibres be diffeomorphic to some fixed manifold
$F$. The bundle is then a fibre bundle, and $F$ is "the fibre" of the bundle. In
a similar vein, we sometimes also refer to the total space $E$ as "the bundle."
Examples of possible fibres are vector spaces (in which case we have a vector
bundle), spheres (in which case we have a sphere bundle), and Lie groups.
When the fibre is a Lie group we speak of a principal bundle. A principal
bundle can be thought of as the parent of various associated bundles, which are
constructed by allowing the Lie group to act on a fibre. A bundle whose fibre
is a one-dimensional vector space is called a line bundle.

The simplest example of a fibre bundle consists of setting $E$ equal to the
Cartesian product $M \times F$ of the base space and the fibre. In this case the
projection just "forgets" the point $f \in F$, and so $\pi : (x, f) \mapsto x$.

A more interesting example can be constructed by taking $M$ to be the
circle $S^1$, and $F$ as the one-dimensional interval $I = [-1, 1]$. We can
assemble these ingredients to make $E$ into a Möbius strip. We do this by
gluing the copy of $I$ over $\theta = 2\pi$ to that over $\theta = 0$ with a half twist so that
the end $-1 \in [-1, 1]$ is attached to $+1$, and vice versa.



+ 1 







-1 











E 

V 


A 








-e- 





-1 



+1 
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Figure 7.1: Mobius strip bundle, together with a section (j>. 



A bundle that is a product $E = M \times F$ is said to be trivial. The Möbius
strip is not a Cartesian product, and is said to be a twisted bundle. The
Möbius strip is, however, locally trivial in that for each $x \in M$ there is an
open retractable neighbourhood $U \subset M$ of $x$ in which $E$ looks like a product
$U \times F$. We will assume that all our bundles are locally trivial in this sense. If
$\{U_i\}$ is a cover of $M$ (i.e. if $M = \bigcup_i U_i$) by such retractable neighbourhoods,
and $F$ is a fixed fibre, then a bundle can be assembled out of the collection
of $U_i \times F$ product bundles by giving gluing rules that identify points on the
fibre over $x \in U_i$ in the product $U_i \times F$ with points in the fibre over $x \in U_j$
in $U_j \times F$ for each $x \in U_i \cap U_j$. These identifications are made by means of
invertible maps $\varphi_{U_iU_j}(x) : F \to F$ that are defined for each $x$ in the overlap
$U_i \cap U_j$. The $\varphi_{U_iU_j}$ are known as transition functions. They must satisfy the
consistency conditions

$$\begin{aligned}
\varphi_{U_iU_i}(x) &= \mathrm{Identity}, \\
\varphi_{U_iU_j}(x)\,\varphi_{U_jU_k}(x) &= \varphi_{U_iU_k}(x), \qquad x \in U_i \cap U_j \cap U_k \ne \emptyset.
\end{aligned} \tag{7.1}$$

A section of a fibre bundle $(E, \pi, M)$ is a smooth map $\phi : M \to E$ such
that $\phi(x)$ lies in the fibre $\pi^{-1}(x)$ over $x$. Thus $\pi \circ \phi = \mathrm{Identity}$. When the
total space $E$ is a product $M \times F$ this is simply a function $\phi : M \to F$.
When the bundle is twisted, as is the Möbius strip, then the section is no
longer a function as it takes no unique value at the points $x$ above which
the fibres are being glued together. Observe that in the Möbius strip the
half-twist forces the section $\phi(x)$ to pass through $0 \in [-1, 1]$. The Möbius
bundle therefore has no nowhere-zero globally defined sections. Many twisted
bundles have no globally defined sections at all.

7.2 Physics Examples

We now provide three applications where the bundle concept appears in
quantum mechanics. The first two illustrations are re-expressions of well-known
physics. The third, the geometric approach to quantization, is perhaps
less familiar.

7.2.1 Landau levels

Consider the Schrödinger eigenvalue problem

$$-\frac{1}{2m}\left(\frac{\partial^2\psi}{\partial x^2} + \frac{\partial^2\psi}{\partial y^2}\right) = E\psi \tag{7.2}$$

for a particle moving on a flat two-dimensional torus. We think of the torus
as an $L_x \times L_y$ rectangle with the understanding that as a particle disappears
through the right-hand boundary it immediately re-appears at the point with
the same $y$ co-ordinate on the left-hand boundary; similarly for the upper
and lower boundaries. In quantum mechanics we implement these rules by
imposing periodic boundary conditions on the wave function:

$$\psi(0, y) = \psi(L_x, y), \qquad \psi(x, 0) = \psi(x, L_y). \tag{7.3}$$
These conditions make the wavefunction a well-defined and continuous function
on the torus, in the sense that after pasting the edges of the rectangle
together to make a real toroidal surface the function has no jumps, and each
point on the surface is assigned a unique value of $\psi$. The wavefunction is a
section of an untwisted line bundle with the torus as its base-space, the fibre
over $(x, y)$ being the one-dimensional complex vector space $\mathbb{C}$ in which
$\psi(x,y)$ takes its value.

Now try to carry out the same program for a particle of charge $e$ moving in
a uniform magnetic field $B$ perpendicular to the $x$-$y$ plane. The Schrödinger
equation becomes

$$-\frac{1}{2m}\left(\frac{\partial}{\partial x} - ieA_x\right)^2\psi - \frac{1}{2m}\left(\frac{\partial}{\partial y} - ieA_y\right)^2\psi = E\psi, \tag{7.4}$$

where $(A_x, A_y)$ is the vector potential. We at once meet a problem. Although
the magnetic field is constant, the vector potential cannot be chosen to be
constant, or even periodic. In the Landau gauge, for example, where we set
$A_x = 0$, the remaining component becomes $A_y = Bx$. This means that as the
particle moves out of the right-hand edge of the rectangle representing the
torus we must perform a gauge transformation that prepares it for motion
in the $(A_x, A_y)$ field it will encounter when it reappears at the left. If (7.4)
holds, then it continues to hold after the simultaneous change

$$\begin{aligned}
\psi(x,y) &\to e^{-ieBL_xy}\,\psi(x,y),\\
-ieA_y &\to -ieA_y + e^{-ieBL_xy}\frac{\partial}{\partial y}e^{+ieBL_xy} = -ie(A_y - BL_x).
\end{aligned} \tag{7.5}$$

At the right-hand boundary $x = L_x$ this gauge transformation resets the
vector potential $A_y$ back to its value at the left-hand boundary. Accordingly,
we modify the boundary conditions to

$$\psi(0, y) = e^{-ieBL_xy}\,\psi(L_x, y), \qquad \psi(x, 0) = \psi(x, L_y). \tag{7.6}$$

The new boundary conditions make the wavefunction into a section$^1$ of a
twisted line bundle over the torus. The fibre is again the one-dimensional
complex vector space $\mathbb{C}$.

$^1$ That the wave "function" is no longer a function should not be disturbing.
Schrödinger's $\psi$ is never really a function of space-time. Seen from a frame moving at
velocity $v$, $\psi(x,t)$ acquires a factor of $\exp(-imvx - imv^2t/2)$, and this is no way for a
self-respecting function of $x$ and $t$ to behave.
We have already met the language in which the gauge field $-ieA_\mu$ is
called a connection on the bundle, and the associated $ieB$ field is the curvature.
We will explain how connections fit into the formal bundle language in section
7.3.

The twisting of the boundary conditions by the gauge transformation
seems innocent, but within it lurks an important constraint related to the
consistency conditions in (7.1). We can find the value of $\psi(L_x, L_y)$ from that
of $\psi(0, 0)$ by using the relations in (7.6) in the order $\psi(0, 0) \to \psi(0, L_y) \to
\psi(L_x, L_y)$, or in the order $\psi(0,0) \to \psi(L_x, 0) \to \psi(L_x, L_y)$. Since we must
obtain the same $\psi(L_x, L_y)$ whichever route we use, we need to satisfy the
condition

$$e^{ieBL_xL_y} = 1. \tag{7.7}$$

This tells us that the Schrödinger problem makes sense only when the magnetic
flux $BL_xL_y$ through the torus obeys

$$eBL_xL_y = 2\pi N \tag{7.8}$$

for some integer $N$. We cannot continuously vary the flux through a finite
torus. This means that if we introduce torus boundary conditions as a
mathematical convenience in a calculation, then physical effects may depend
discontinuously on the field.

The integer $N$ counts the number of times the phase of the wavefunction
is twisted as we travel from $(x = L_x, y = 0)$ to $(x = L_x, y = L_y)$ gluing the
right-hand edge wavefunction back to the left-hand edge wavefunction.
This twisting number is a topological invariant. We have met this invariant
before, in section 4.6. It is the first Chern number of the wavefunction
bundle. If we permit $B$ to become position dependent without altering the
total twist $N$, then quantities such as energies and expectation values can
change smoothly with $B$. If $N$ is allowed to change, however, then these
quantities may jump discontinuously.

The energy $E = E_n$ solutions to (7.4) with boundary conditions (7.6) are
given by

$$\Psi_{n,k}(x,y) = \sum_{p=-\infty}^{\infty}\psi_n\!\left(x - \frac{k}{eB} - pL_x\right)e^{i(eBpL_x + k)y}. \tag{7.9}$$

Here $\psi_n(x)$ is a harmonic-oscillator wavefunction obeying

$$-\frac{1}{2m}\frac{d^2\psi_n}{dx^2} + \frac{1}{2}m\omega^2x^2\psi_n = E_n\psi_n, \tag{7.10}$$

with $\omega = eB/m$ the classical cyclotron frequency, and $E_n = \omega(n + 1/2)$. The
parameter $k$ takes the values $2\pi q/L_y$ for $q$ an integer. At each energy $E_n$ we
obtain $N$ independent eigenfunctions as $q$ runs from 1 to $eBL_xL_y/2\pi$. These
$N$-fold degenerate states are the Landau levels. The degeneracy, being of
necessity an integer, provides yet another explanation for why the flux must
be quantized.
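The twisted boundary conditions (7.6) can be checked numerically. The following is a minimal sketch, assuming units with $e = m = 1$, the lowest level $n = 0$ with an unnormalised Gaussian $\psi_0(x) = e^{-eBx^2/2}$, and the standard Landau-gauge image sum $\Psi(x,y) = \sum_p \psi_0(x - k/eB - pL_x)\,e^{i(eBpL_x + k)y}$. The function names and the truncation parameter `n_images` are illustrative choices, not part of the text.

```python
import cmath
import math


def landau_wavefunction(x, y, q, N, Lx, Ly, n_images=30):
    """Unnormalised n = 0 Landau-gauge state on the torus (units e = m = 1).

    B is fixed by the flux condition e*B*Lx*Ly = 2*pi*N, and k = 2*pi*q/Ly.
    The infinite image sum is truncated to |p| <= n_images, an assumption
    that is harmless here because the Gaussians decay very quickly.
    """
    B = 2 * math.pi * N / (Lx * Ly)
    k = 2 * math.pi * q / Ly
    total = 0j
    for p in range(-n_images, n_images + 1):
        shift = x - k / B - p * Lx
        gaussian = math.exp(-0.5 * B * shift * shift)
        total += gaussian * cmath.exp(1j * (B * p * Lx + k) * y)
    return total


def boundary_mismatch(q, N, Lx, Ly, x=0.3, y=0.37):
    """Return the violations of the two conditions in (7.6) at sample points."""
    B = 2 * math.pi * N / (Lx * Ly)
    twisted = landau_wavefunction(0.0, y, q, N, Lx, Ly) - cmath.exp(
        -1j * B * Lx * y
    ) * landau_wavefunction(Lx, y, q, N, Lx, Ly)
    periodic = landau_wavefunction(x, 0.0, q, N, Lx, Ly) - landau_wavefunction(
        x, Ly, q, N, Lx, Ly
    )
    return abs(twisted), abs(periodic)
```

Calling `boundary_mismatch(q=1, N=3, Lx=2.0, Ly=1.5)` should return two numbers at the level of rounding error. If the flux is moved away from an integer multiple of $2\pi/e$ (for instance by computing `B` from a non-integer `N`), the periodicity in $y$ is lost, which is the numerical face of the flux quantization condition.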



7.2.2 The Berry connection 

Suppose we are in possession of a quantum-mechanical hamiltonian $H(\xi)$
depending on some parameters $\xi = (\xi^1, \xi^2, \ldots) \in M$, and know the eigenstates
$|n;\xi\rangle$ that obey

$$H(\xi)|n;\xi\rangle = E_n(\xi)|n;\xi\rangle. \tag{7.11}$$

If, for fixed $n$, we can find a smooth family of eigenstates $|n;\xi\rangle$, one for
every $\xi$ in the parameter space $M$, we have a vector bundle over the space
$M$. The fibre above $\xi$ is the one-dimensional vector space spanned by $|n;\xi\rangle$.
This bundle is a sub-bundle of the product bundle $M \times \mathcal{H}$ where $\mathcal{H}$ is the
Hilbert space on which $H$ acts. Although the larger bundle is not twisted,
the sub-bundle may be. It may also not exist: if the state $|n;\xi\rangle$ becomes
degenerate with another state $|m;\xi\rangle$ at some value of $\xi$, then both states
can vary discontinuously with the parameters, and we wish to exclude this
possibility.

In the previous paragraph we considered the evolution of the eigenstates
of a time-independent Hamiltonian as we varied its parameters. Another,
more physical, evolution is given by solving the time-dependent Schrödinger
equation

$$i\partial_t|\psi(t)\rangle = H(\xi(t))|\psi(t)\rangle \tag{7.12}$$

so as to follow the evolution of a state $|\psi(t)\rangle$ as the parameters are slowly
varied. If the initial state $|\psi(0)\rangle$ coincides with the eigenstate $|0;\xi(0)\rangle$, and
if the time evolution of the parameters is slow enough, then $|\psi(t)\rangle$ is expected to
remain close to the corresponding eigenstate $|0;\xi(t)\rangle$ of the time-independent
Schrödinger equation for the hamiltonian $H(\xi(t))$. To determine exactly how
"close" it stays, insert the expansion

$$|\psi(t)\rangle = \sum_n a_n(t)\,|n;\xi(t)\rangle\exp\left\{-i\int_0^t E_0(\xi(t))\,dt\right\} \tag{7.13}$$

into (7.12) and take the inner-product with $|m;\xi\rangle$. For $m \neq 0$, we expect
that the overlap $\langle m;\xi|\psi(t)\rangle$ will be small and of order $O(\partial\xi/\partial t)$. Assuming
that this is so, we read off that

$$\dot a_0 + a_0\,\langle 0;\xi|\partial_\mu|0;\xi\rangle\frac{\partial\xi^\mu}{\partial t} = 0, \qquad (m = 0) \tag{7.14}$$

$$a_m = i a_0\,\frac{\langle m;\xi|\partial_\mu|0;\xi\rangle}{E_m - E_0}\frac{\partial\xi^\mu}{\partial t}, \qquad (m \neq 0) \tag{7.15}$$

up to first-order accuracy in time derivatives of the $|n;\xi(t)\rangle$. Hence

$$|\psi(t)\rangle = e^{i\gamma_{\text{Berry}}(t)}\left\{|0;\xi\rangle + i\sum_{m\neq 0}\frac{|m;\xi\rangle\langle m;\xi|\partial_\mu|0;\xi\rangle}{E_m - E_0}\frac{\partial\xi^\mu}{\partial t} + \ldots\right\}e^{-i\int_0^t E_0(\xi(t))\,dt}, \tag{7.16}$$

where the dots refer to terms of higher order in time derivatives.

Equation (7.16) constitutes the first two terms in a systematic adiabatic
series expansion. The factor $a_0(t) = \exp\{i\gamma_{\text{Berry}}(t)\}$ is the solution of the
differential equation (7.14). The angle $\gamma_{\text{Berry}}$ is known as Berry's phase after
the British mathematical physicist Michael Berry. It is needed to take up the
slack between the arbitrary $\xi$-dependent phase choice at our disposal when
defining the $|0;\xi\rangle$, and the specific phase selected by the Schrödinger equation
as it evolves the state $|\psi(t)\rangle$. Berry's phase is also called the geometric phase
because it depends only on the Hilbert-space geometry of the family of states
$|0;\xi\rangle$, and not on their energies. We can write

$$\gamma_{\text{Berry}}(t) = i\int_0^t\langle 0;\xi|\partial_\mu|0;\xi\rangle\frac{\partial\xi^\mu}{\partial t}\,dt \tag{7.17}$$

and regard the one-form

$$A_{\text{Berry}} = \langle 0;\xi|\partial_\mu|0;\xi\rangle\,d\xi^\mu = \langle 0;\xi|d|0;\xi\rangle \tag{7.18}$$

as a connection on the bundle of states over the space of parameters. The
equation

$$\dot\xi^\mu\left(\frac{\partial}{\partial\xi^\mu} + A_{\text{Berry},\mu}\right)\psi = 0 \tag{7.19}$$

then identifies the Schrödinger time evolution with parallel transport. It
seems reasonable to refer to this particular parallel transport as "Berry
transport."


In order for corrections to the approximation $|\psi(t)\rangle \approx (\text{phase})\,|0;\xi(t)\rangle$ to
remain small, we need the denominator $(E_m - E_0)$ to remain large when
compared to its numerator. The state that we are following must therefore
never become degenerate with any other state.

Monopole bundle

Consider, for example, a spin-1/2 particle in a magnetic field. If the field
points in direction $\mathbf{n}$, the Hamiltonian is

$$H(\mathbf{n}) = \mu|B|\,\boldsymbol{\sigma}\cdot\mathbf{n}. \tag{7.20}$$

There are two eigenstates with energy $E_\pm = \pm\mu|B|$. Let us focus on
the eigenstate $|\psi_+\rangle$ corresponding to $E_+$. For each $\mathbf{n}$ we can obtain an $E_+$
eigenstate by applying the projection operator

$$P = \frac{1}{2}(I + \mathbf{n}\cdot\boldsymbol{\sigma}) = \frac{1}{2}\begin{pmatrix} 1 + n_z & n_x - in_y\\ n_x + in_y & 1 - n_z\end{pmatrix} \tag{7.21}$$

to almost any vector, and then multiplying by a real normalization constant
$\mathcal{N}$. Applying $P$ to a "spin-up" state, for example, gives

$$|\psi_+^{(1)}(\mathbf{n})\rangle = \mathcal{N}\,\frac{1}{2}(I + \mathbf{n}\cdot\boldsymbol{\sigma})\begin{pmatrix}1\\0\end{pmatrix} = \begin{pmatrix}\cos\theta/2\\ e^{i\phi}\sin\theta/2\end{pmatrix}. \tag{7.22}$$

Here $\theta$ and $\phi$ are spherical polar angles on $S^2$ that specify the direction of $\mathbf{n}$.

Although the bundle of $E = E_+$ eigenstates is globally defined, the family
of states $|\psi_+^{(1)}(\mathbf{n})\rangle$ that we have obtained, and would like to use as a basis for
the fibre over $\mathbf{n}$, becomes singular when $\mathbf{n}$ is in the vicinity of the south pole
$\theta = \pi$. This is because the factor $e^{i\phi}$ is multivalued at the south pole. There
is no problem at the north pole because the ambiguous phase $e^{i\phi}$ multiplies
$\sin\theta/2$, which is zero there.

Near the south pole, however, we can project from a "spin-down" state
to find

$$|\psi_+^{(2)}(\mathbf{n})\rangle = \mathcal{N}\,\frac{1}{2}(I + \mathbf{n}\cdot\boldsymbol{\sigma})\begin{pmatrix}0\\1\end{pmatrix} = \begin{pmatrix}e^{-i\phi}\cos\theta/2\\ \sin\theta/2\end{pmatrix}. \tag{7.23}$$

This family of eigenstates is smooth near the south pole, but is ill-defined at
the north pole. As in section 4.6, we are compelled to cover the sphere $S^2$
by two caps $D_+$ and $D_-$, and use $|\psi_+^{(1)}\rangle$ in $D_+$ and $|\psi_+^{(2)}\rangle$ in $D_-$. The two
families are related by

$$|\psi_+^{(1)}(\mathbf{n})\rangle = e^{i\phi}\,|\psi_+^{(2)}(\mathbf{n})\rangle \tag{7.24}$$

in the cingular overlap region $D_+ \cap D_-$. Here $e^{i\phi}$ is the transition function
that glues the two families of eigenstates together.
The Berry connections are

$$\begin{aligned}
A_+^{(1)} &= \langle\psi_+^{(1)}|d|\psi_+^{(1)}\rangle = \frac{i}{2}(1 - \cos\theta)\,d\phi,\\
A_+^{(2)} &= \langle\psi_+^{(2)}|d|\psi_+^{(2)}\rangle = -\frac{i}{2}(1 + \cos\theta)\,d\phi.
\end{aligned} \tag{7.25}$$

In their common domain of definition, they are related by a gauge
transformation

$$A_+^{(1)} = A_+^{(2)} + i\,d\phi. \tag{7.26}$$

The curvature of either connection is

$$dA = \frac{i}{2}\sin\theta\,d\theta\wedge d\phi = \frac{i}{2}\,d(\text{Area}). \tag{7.27}$$

The curvature being the area two-form tells us that when we slowly change
the direction of $\mathbf{B}$ and bring it back to its original orientation the spin state
will, in addition to the dynamical phase $\exp\{-iE_+t\}$, have accumulated a
phase equal to (minus) one-half of the area enclosed by the trajectory of $\mathbf{n}$
on $S^2$. The two-form field $dA$ can be thought of as the flux of a magnetic
monopole residing at the centre of the sphere. The bundle of one-dimensional
vector spaces $\text{span}[|\psi_+(\mathbf{n})\rangle]$ over $S^2$ is therefore called the monopole bundle.
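The geometric phase for the spin-1/2 example can be checked with a discretised loop. The sketch below assumes the northern-patch eigenstate $(\cos\theta/2,\ e^{i\phi}\sin\theta/2)$ and the convention $a_0 = e^{i\gamma_{\text{Berry}}}$ with $\gamma_{\text{Berry}} = i\oint\langle\psi|d|\psi\rangle$, so that the product of overlaps around the loop approximates $e^{-i\gamma_{\text{Berry}}}$. The function names are illustrative.

```python
import cmath
import math


def spin_up_state(theta, phi):
    """Northern-patch E+ eigenstate (cos(theta/2), e^{i phi} sin(theta/2))."""
    return (
        complex(math.cos(theta / 2)),
        cmath.exp(1j * phi) * math.sin(theta / 2),
    )


def berry_phase_latitude(theta, steps=4000):
    """Discretised Berry phase for one circuit of n around the latitude theta.

    The product of overlaps <psi_k|psi_{k+1}> approximates exp(-i*gamma).
    """
    product = 1 + 0j
    for k in range(steps):
        a = spin_up_state(theta, 2 * math.pi * k / steps)
        b = spin_up_state(theta, 2 * math.pi * (k + 1) / steps)
        product *= a[0].conjugate() * b[0] + a[1].conjugate() * b[1]
    return -cmath.phase(product)


def half_solid_angle(theta):
    """One half of the area of the polar cap bounded by the latitude theta."""
    return math.pi * (1.0 - math.cos(theta))
```

For a circuit at $\theta = \pi/3$ the cap has area $\pi$, and `berry_phase_latitude(math.pi / 3)` should come out close to $-\pi/2$, i.e. minus one-half of the enclosed area. The result is only defined modulo $2\pi$, so larger caps need the comparison to be made modulo $2\pi$.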

7.2.3 Quantization 

In this section we provide a short introduction to geometric quantization. 
This idea, due largely to Kirillov, Kostant and Souriau, extends the familiar
technique of canonical quantization to phase spaces with more structure
than that of the harmonic oscillator. We illustrate the formalism by quan- 
tizing spin, and show how the resulting Hilbert space provides an example of 
the Borel-Weil-Bott construction of the representations of a semi-simple Lie 
group as spaces of sections of holomorphic line bundles. 
Prequantization

The passage from classical mechanics to quantum mechanics involves re- 
placing the classical variables by operators in such a way that the classical 
Poisson-bracket algebra is mirrored by the operator commutator algebra. In 
general, this process of quantization is not possible without making some 
compromises. It is, however, usually possible to pre-quantize a phase-space 
with its associated Poisson algebra. 

Let $M$ be a $2n$-dimensional classical phase-space with its closed symplectic
form $\omega$. Classically a function $f : M \to \mathbb{R}$ gives rise to a Hamiltonian
vector field $v_f$ via Hamilton's equations

$$df = -i_{v_f}\omega. \tag{7.28}$$

We saw in section 2.4.2 that the closure condition $d\omega = 0$ ensures that
the Poisson bracket

$$\{f, g\} = v_f\,g = \omega(v_f, v_g) \tag{7.29}$$

obeys

$$[v_f, v_g] = v_{\{f,g\}}. \tag{7.30}$$

Now suppose that the cohomology class of $(2\pi\hbar)^{-1}\omega$ in $H^2(M, \mathbb{R})$ has the
property that its integrals over cycles in $H_2(M, \mathbb{Z})$ are integers. Then (it can
be shown) there exists a line bundle $L$ over $M$ with curvature $F = -i\hbar^{-1}\omega$.
If we locally write $\omega = d\eta$, where $\eta = \eta_\mu dx^\mu$, then the connection one-form
is $A = -i\hbar^{-1}\eta$ and the covariant derivative

$$\nabla_v = v^\mu(\partial_\mu - i\hbar^{-1}\eta_\mu) \tag{7.31}$$

acts on sections of the line bundle. The corresponding curvature is

$$F(u, v) = [\nabla_u, \nabla_v] - \nabla_{[u,v]} = -i\hbar^{-1}\omega(u, v). \tag{7.32}$$

We define a pre-quantized operator $\rho(f)$ that, acting on sections $\Psi(x)$ of
the line bundle, corresponds to the classical function $f$:

$$\rho(f) = -i\hbar\nabla_{v_f} + f. \tag{7.33}$$

For hamiltonian vector fields $v_f$ and $v_g$ we have

$$\begin{aligned}
[\hbar\nabla_{v_f} + if, \nabla_{v_g}] &= \hbar\nabla_{[v_f,v_g]} - i\omega(v_f, v_g) + i[f, \nabla_{v_g}]\\
&= \hbar\nabla_{[v_f,v_g]} - i(i_{v_f}\omega + df)(v_g)\\
&= \hbar\nabla_{[v_f,v_g]},
\end{aligned} \tag{7.34}$$

and so

$$\begin{aligned}
[-i\hbar\nabla_{v_f} + f, -i\hbar\nabla_{v_g} + g] &= -\hbar^2\nabla_{[v_f,v_g]} - i\hbar\,v_f g\\
&= -i\hbar\left(-i\hbar\nabla_{[v_f,v_g]} + \{f, g\}\right)\\
&= -i\hbar\left(-i\hbar\nabla_{v_{\{f,g\}}} + \{f, g\}\right).
\end{aligned} \tag{7.35}$$

Equation (7.35) is Dirac's quantization rule:

$$i[\rho(f), \rho(g)] = \hbar\rho(\{f, g\}). \tag{7.36}$$

The process of quantization is completed, when possible, by defining a
polarization. This is a restriction on the variables that we allow the
wavefunctions to depend on. For example, if there is a global set of Darboux
co-ordinates $p$, $q$ we may demand that the wavefunction depend only on $q$,
or only on the combination $p + iq$. Such a restriction is necessary so that
the representation $f \mapsto \rho(f)$ is irreducible. Since globally defined Darboux
co-ordinates do not usually exist, this step is the hard part of quantization.

The precise definition of a polarized section is rather complicated. We
can only sketch it here, but give a concrete example in the next section. At
each point $x \in M$ the symplectic form defines a skew bilinear form. We seek
a Lagrangian subspace $V_x \subset TM_x$ for this form. A Lagrangian subspace
is one such that $V_x = V_x^\perp$. For example, if

$$\omega = dp_1 \wedge dq_1 + dp_2 \wedge dq_2, \tag{7.37}$$

then the space spanned by the $\partial_q$'s is Lagrangian, as is the space spanned by
the $\partial_p$'s. We allow the coefficients of the vectors in $V_x$ to be complex numbers.
The vector fields spanning the $V_x$'s form a distribution. We require it to be
integrable, so that the $V_x$ are the tangent spaces to a global foliation of $M$.
A section $\Psi$ of the line bundle is polarized if $\nabla_\xi\Psi = 0$ for all $\xi \in V_x$.

We define an inner product on the space of polarized sections by using
the Liouville measure $\omega^n/n!$ on the phase space. The quantum Hilbert space
then consists of finite-norm polarized sections of $L$. Only classical functions
that give rise to polarization-compatible vector fields will have their
Poisson-bracket algebra coincide with the quantum commutator algebra.

Quantizing spin 

To illustrate these ideas, we quantize spin. The classical mechanics of spin 
was discussed in section 2.4.2. There we showed that the appropriate phase 
space is the 2-sphere equipped with a symplectic form proportional to the
area form. Here we must be specific about the constant of proportionality.
We choose units in which $\hbar \to 1$, and take $\omega = j\,d(\text{Area})$. The integrality of
$\omega/2\pi$ requires that $j$ be an integer or half integer. We will assume that $j$ is
positive.

We parametrize the 2-sphere with complex stereographic co-ordinates $z$,
$\bar z$ which are constructed similarly to those in section 3.4.3. This choice will
allow us to impose a natural complex polarization on the wavefunctions. In
contrast to section 3.4.3, however, it is here convenient to make the point
$z = 0$ correspond to the south pole, so the polar co-ordinates $\theta$, $\phi$ on the
sphere are related to $z$, $\bar z$ via

$$\cos\theta = \frac{|z|^2 - 1}{|z|^2 + 1}, \qquad e^{i\phi}\sin\theta = \frac{2z}{|z|^2 + 1}, \qquad e^{-i\phi}\sin\theta = \frac{2\bar z}{|z|^2 + 1}. \tag{7.38}$$

In terms of the $z$, $\bar z$ co-ordinates

$$\omega = \frac{2ij}{(1 + |z|^2)^2}\,dz \wedge d\bar z. \tag{7.39}$$

As long as we avoid the north pole where $z = \infty$, we can write

$$\omega = d\left\{ij\,\frac{z\,d\bar z - \bar z\,dz}{1 + |z|^2}\right\} = d\eta, \tag{7.40}$$

and so the local connection form has components proportional to

$$\eta_z = -ij\,\frac{\bar z}{|z|^2 + 1}, \qquad \eta_{\bar z} = ij\,\frac{z}{|z|^2 + 1}. \tag{7.41}$$

The covariant derivatives are therefore

$$\nabla_z = \frac{\partial}{\partial z} - j\,\frac{\bar z}{|z|^2 + 1}, \qquad \nabla_{\bar z} = \frac{\partial}{\partial\bar z} + j\,\frac{z}{|z|^2 + 1}. \tag{7.42}$$

We impose the polarization condition that $\nabla_{\bar z}\Psi = 0$. This condition
requires the allowed sections to be of the form

$$\Psi(z, \bar z) = (1 + |z|^2)^{-j}\,\psi(z), \tag{7.43}$$

where $\psi$ depends only on $z$. It is natural to combine the $(1 + |z|^2)^{-j}$ prefactor
with the Liouville measure so that the inner product becomes

$$\langle\psi|\chi\rangle = \frac{2j+1}{2\pi i}\int_{\mathbb{C}}\frac{d\bar z\wedge dz}{(1 + |z|^2)^{2j+2}}\,\overline{\psi(z)}\,\chi(z). \tag{7.44}$$

The normalizable wavefunctions are then polynomials in $z$ of degree less than
or equal to $2j$, and a complete orthonormal set is given by

$$\psi_m(z) = \sqrt{\frac{(2j)!}{(j-m)!\,(j+m)!}}\;z^{j+m}. \tag{7.45}$$
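The normalization of the monomials can be checked numerically. The sketch below assumes the weighted inner product $\langle\psi|\chi\rangle = \frac{2j+1}{\pi}\int\frac{d^2z}{(1+|z|^2)^{2j+2}}\overline{\psi}\chi$ and basis functions $\sqrt{\binom{2j}{k}}\,z^k$ with $k = j + m$. Monomials of different degree are orthogonal by the angular integration, so only the norms need to be computed. The function name and the midpoint-rule resolution `n_points` are illustrative.

```python
import math


def norm_squared(twoj, k, n_points=20000):
    """Norm of sqrt(C(2j, k)) z^k under the weighted inner product.

    After the angular integration and the substitution t = |z|^2/(1+|z|^2)
    the integral reduces to (2j+1) C(2j,k) * int_0^1 t^k (1-t)^(2j-k) dt,
    which is evaluated here with a simple midpoint rule.
    """
    h = 1.0 / n_points
    integral = 0.0
    for i in range(n_points):
        t = (i + 0.5) * h
        integral += t**k * (1.0 - t) ** (twoj - k)
    integral *= h
    return (twoj + 1) * math.comb(twoj, k) * integral
```

For $j = 3/2$, each of `norm_squared(3, k)` with `k` in `range(4)` should be close to one. A monomial of degree greater than $2j$ would make the original integral over $|z|^2$ diverge, which is why the normalizable sections stop at degree $2j$.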

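We desire to find the quantum operators $\rho(J_i)$ corresponding to the
components

$$J_1 = j\sin\theta\cos\phi, \qquad J_2 = j\sin\theta\sin\phi, \qquad J_3 = j\cos\theta, \tag{7.46}$$

of a classical spin $\mathbf{J}$ of magnitude $j$, and also to the ladder-operator
components $J_\pm = J_1 \pm iJ_2$. In our complex co-ordinates these functions become

$$J_3 = j\,\frac{|z|^2 - 1}{|z|^2 + 1}, \qquad J_+ = j\,\frac{2z}{|z|^2 + 1}, \qquad J_- = j\,\frac{2\bar z}{|z|^2 + 1}. \tag{7.47}$$

Hamilton's equations read

$$\dot z = i\,\frac{(1 + |z|^2)^2}{2j}\frac{\partial H}{\partial\bar z}, \qquad \dot{\bar z} = -i\,\frac{(1 + |z|^2)^2}{2j}\frac{\partial H}{\partial z}, \tag{7.48}$$

and the Hamiltonian vector fields corresponding to the classical phase space
functions $J_3$, $J_+$ and $J_-$ are

$$\begin{aligned}
v_{J_3} &= iz\,\partial_z - i\bar z\,\partial_{\bar z},\\
v_{J_+} &= -iz^2\,\partial_z - i\,\partial_{\bar z},\\
v_{J_-} &= i\,\partial_z + i\bar z^2\,\partial_{\bar z}.
\end{aligned} \tag{7.49}$$

Using the recipe (7.33) for $\rho(H)$ from the previous section, and the fact
that $\nabla_{\bar z}\Psi = 0$, we find, for example, that

$$\begin{aligned}
\rho(J_+)(1 + |z|^2)^{-j}\psi(z) &= \left[-z^2\left(\frac{\partial}{\partial z} - \frac{j\bar z}{1 + |z|^2}\right) + \frac{2jz}{1 + |z|^2}\right](1 + |z|^2)^{-j}\psi(z)\\
&= (1 + |z|^2)^{-j}\left[-z^2\frac{\partial}{\partial z} + 2jz\right]\psi(z).
\end{aligned} \tag{7.50}$$

It is natural to define operators

$$\tilde J_i = (1 + |z|^2)^{j}\,\rho(J_i)\,(1 + |z|^2)^{-j} \tag{7.51}$$

that act only on the $z$-polynomial part $\psi(z)$ of the section $\Psi(z, \bar z)$. We then
have

$$\tilde J_+ = -z^2\frac{\partial}{\partial z} + 2jz. \tag{7.52}$$

Similarly, we find that

$$\tilde J_- = \frac{\partial}{\partial z}, \tag{7.53}$$

$$\tilde J_3 = z\frac{\partial}{\partial z} - j. \tag{7.54}$$

These operators obey the su(2) Lie algebra relations

$$[\tilde J_3, \tilde J_\pm] = \pm\tilde J_\pm, \qquad [\tilde J_+, \tilde J_-] = 2\tilde J_3, \tag{7.55}$$

and act on the $\psi_m(z)$ monomials as

$$\begin{aligned}
\tilde J_3\,\psi_m(z) &= m\,\psi_m(z),\\
\tilde J_\pm\,\psi_m(z) &= \sqrt{j(j+1) - m(m\pm 1)}\;\psi_{m\pm 1}(z).
\end{aligned} \tag{7.56}$$

This is the familiar action of the su(2) generators on $|j, m\rangle$ basis states.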
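The algebra of these differential operators can be verified exactly on polynomials. The following is a minimal sketch, assuming the forms $-z^2\partial_z + 2jz$, $\partial_z$ and $z\partial_z - j$ for the raising, lowering and diagonal operators, and basis functions $\sqrt{\binom{2j}{k}}\,z^k$ with $k = j + m$. Polynomials are stored as dictionaries mapping powers to coefficients, and all helper names are illustrative.

```python
from fractions import Fraction
from math import comb


def _clean(poly):
    """Drop zero coefficients so that polynomials compare by equality."""
    return {k: c for k, c in poly.items() if c != 0}


def j_plus(poly, twoj):
    """Apply -z^2 d/dz + 2j z; z^k is sent to (2j - k) z^(k+1)."""
    return _clean({k + 1: (twoj - k) * c for k, c in poly.items()})


def j_minus(poly, twoj):
    """Apply d/dz; z^k is sent to k z^(k-1)."""
    return _clean({k - 1: k * c for k, c in poly.items() if k > 0})


def j_three(poly, twoj):
    """Apply z d/dz - j; z^k is sent to (k - j) z^k."""
    return _clean({k: (k - Fraction(twoj, 2)) * c for k, c in poly.items()})


def subtract(p, q):
    """Return the polynomial p - q."""
    keys = set(p) | set(q)
    return _clean({k: p.get(k, 0) - q.get(k, 0) for k in keys})


def commutator_plus_minus(poly, twoj):
    """Return [J+, J-] applied to poly."""
    return subtract(
        j_plus(j_minus(poly, twoj), twoj), j_minus(j_plus(poly, twoj), twoj)
    )


def ladder_coefficient_squared(twoj, k):
    """Square of c in J+ psi_m = c psi_(m+1), with k = j + m and k < 2j."""
    return Fraction((twoj - k) ** 2 * comb(twoj, k), comb(twoj, k + 1))
```

As a usage example, for $j = 3/2$ the polynomial `{0: 1, 2: 3, 3: -1}` is mapped by `commutator_plus_minus` to twice its image under `j_three`, and `j_plus({3: 1}, 3)` returns the empty polynomial, showing that the top monomial $z^{2j}$ is annihilated by the raising operator.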
Exercise 7.1: Show that with respect to the inner product (7.44) we have

$$\tilde J_3^\dagger = \tilde J_3, \qquad \tilde J_+^\dagger = \tilde J_-.$$
Coherent states and the Borel-Weil-Bott theorem

We now explain how the spin wavefunctions $\psi_m(z)$ can be understood as
sections of a holomorphic line bundle.

Suppose that we have a compact Lie group $G$ and a unitary irreducible
representation $g \in G \mapsto D^J(g)$. Let $|0\rangle$ be the normalized highest (or lowest)
weight state in the representation space. Consider the states

$$|g\rangle = D^J(g)|0\rangle, \qquad \langle g| = \langle 0|[D^J(g)]^\dagger. \tag{7.57}$$

The $|g\rangle$ compose a family of generalized coherent states.$^2$ There is a
continuous infinity of the $|g\rangle$, and so they cannot constitute an orthonormal set on
the finite dimensional representation space. The matrix-element orthogonality
property (6.79), however, provides us with a useful over-completeness
relation

$$I = \frac{\dim(J)}{\text{Vol}\,G}\int_G |g\rangle\langle g|. \tag{7.58}$$

The integral is over all of $G$, but many points in $G$ give the same
contribution. The maximal torus $T$ is the abelian subgroup of $G$ obtained by
exponentiating elements of the Cartan algebra. Because any weight vector is
a common eigenvector of the Cartan algebra, elements of $T$ leave $|0\rangle$ fixed up
to a phase. The set of distinct $|g\rangle$ in the integral can therefore be identified
with $G/T$. This coset space is always an even dimensional manifold, and
thus a candidate phase space.

Consider in particular the spin-$j$ representation of SU(2). The coset space
$G/T$ is then $\text{SU}(2)/\text{U}(1) \simeq S^2$. We can write a general element of SU(2) as

$$U = \exp(\bar zJ_+)\exp(\theta J_3)\exp(\gamma J_-) \tag{7.59}$$

for some complex parameters $\bar z$, $\theta$ and $\gamma$ which are functions of the three real
co-ordinates that parameterize SU(2). We let $U$ act on the lowest-weight
state $|j, -j\rangle$. The rightmost factor has no effect on the lowest weight state,
and the middle factor only multiplies it by a constant. We therefore restrict
our attention to the states

$$|z\rangle = \exp(\bar zJ_+)|j, -j\rangle, \qquad \langle z| = \langle j, -j|\exp(zJ_-) = (|z\rangle)^\dagger. \tag{7.60}$$

$^2$ A. Perelomov, Generalized Coherent States and their Applications, (Springer-Verlag,
Berlin 1986).

These states are not normalized, but have the advantage that the $\langle z|$ are
holomorphic in the parameter $z$; i.e. they depend on $z$, but not on $\bar z$.

The set of distinct $|z\rangle$ can still be identified with the 2-sphere, and $z$, $\bar z$
are its complex stereographic co-ordinates. This identification is an example
of a general property of compact Lie groups:

$$G/T \cong G_{\mathbb{C}}/B_+. \tag{7.61}$$

Here $G_{\mathbb{C}}$ is the complexification of $G$ (the group $G$, but with its parameters
allowed to be complex) and $B_+$ is the Borel group whose Lie algebra consists
of the Cartan algebra together with the step-up ladder operators.
The inner product of two $|z\rangle$ states is

$$\langle z'|z\rangle = (1 + \bar zz')^{2j}, \tag{7.62}$$

and the eigenstates $|j, m\rangle$ of $\mathbf{J}^2$ and $J_3$ possess coherent state wavefunctions

$$\psi_m^{(1)}(z) = \langle z|j, m\rangle = \sqrt{\frac{(2j)!}{(j-m)!\,(j+m)!}}\;z^{j+m}. \tag{7.63}$$

We recognize these as our spin wavefunctions from the previous section.
The over-completeness relation can be written as

$$I = \frac{2j+1}{2\pi i}\int\frac{d\bar z\wedge dz}{(1 + \bar zz)^{2j+2}}\,|z\rangle\langle z|, \tag{7.64}$$

and provides the inner product for the coherent-state wavefunctions. If
$\psi(z) = \langle z|\psi\rangle$ and $\chi(z) = \langle z|\chi\rangle$ then

$$\begin{aligned}
\langle\psi|\chi\rangle &= \frac{2j+1}{2\pi i}\int\frac{d\bar z\wedge dz}{(1 + \bar zz)^{2j+2}}\,\langle\psi|z\rangle\langle z|\chi\rangle\\
&= \frac{2j+1}{2\pi i}\int\frac{d\bar z\wedge dz}{(1 + \bar zz)^{2j+2}}\,\overline{\psi(z)}\,\chi(z),
\end{aligned} \tag{7.65}$$

which coincides with (7.44).

The wavefunctions ipm{z) are singular at the north pole where z = oo. 
Indeed there is no actual state (oo| because the phase of this putative limiting 
state would depend on the direction from which we approach the point at 
infinity. We may, however, define a second family of coherent states 

|C> 2 = exp(CJ-)|j,j>, 2 <C| = 0',j|exp(CJ + ), (7-66) 
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and form the wavefunctions 

V4 2) (C)=2<Cb',™>. (7.67) 

These new states and wavefunctions are well defined in the vicinity of the 
north pole, but singular near the south pole. 

To find the relation between V^(C) an d ip^(z) we note that the matrix 
identity 

"0 -1 
1 

coupled with the faithfulness of the spin-| representation of SU(2), implies 
the relation 

wexp(zJ + ) = exp (-z' 1 J_)(-z) 2J:i exp (z' 1 J+), (7.69) 
where w = exp(— in J 2 ). We also note that 

0', j\w = {-lf ] (j, -j | , (j, -j\w = (j, j 

Thus, 

^?(*) = (j,-j\e zJ -\j,m) 

= (-l) 2i 0',J>e zJ -|j,m) 
= {-l)*tiJ\e-r 1 '-(- z ) 2J 'er 1J +\j,m) 
= (-l) 2i (-z)*<j,j\e'- 1 '+\j, m ) 
= z 23 ^\z~ 1 )- (7-71) 

The transition function z 2j that relates ipm\z) to ipm(( = \jz) depends only 
on z. We therefore say that the wavefunctions ipm (z) and ipm (C) are the local 
components of a global section ip m <-> \j,m) of a holomorphic line bundle. 
The requirement that the transition function and its inverse be holomorphic 
and single valued in the overlap of the z and ( coordinate patches forces 2j 
to be an integer. The ip m form a basis for the space of global holomorphic 
sections of this bundle. 

Borel, Weil and Bott showed that any finite-dimensional representation of 
a semi-simple Lie group G can be realized as the space of global holomorphic 
sections of a line bundle over Gc/B + . This bundle is constructed from the 



1 

z 1 



1 
1 1 



-z 



1 





(7.68) 



(7.70) 
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highest (or lowest) weight vectors in the representation by a natural gener- 
alization of the method we have used for spin. This idea has been extended 
by Ed Witten and others to infinite dimensional Lie groups, where it can be 
used, for example, to quantize two-dimensional gravity. 
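The spin-1/2 matrix factorisation that underlies the relation between the two coordinate patches is easy to confirm numerically. The sketch below assumes the spin-1/2 matrices $J_+ = \begin{pmatrix}0&1\\0&0\end{pmatrix}$, $J_- = \begin{pmatrix}0&0\\1&0\end{pmatrix}$, $J_3 = \tfrac12\,\text{diag}(1,-1)$ and $w = \exp(-i\pi J_2) = \begin{pmatrix}0&-1\\1&0\end{pmatrix}$, and checks $w\,e^{zJ_-} = e^{-z^{-1}J_-}(-z)^{2J_3}e^{z^{-1}J_+}$ for a sample complex $z$. The function names are illustrative.

```python
def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def identity_sides(z):
    """Return both sides of the factorisation for a non-zero complex z."""
    w = [[0, -1], [1, 0]]               # exp(-i*pi*J_2) for spin 1/2
    exp_z_jminus = [[1, 0], [z, 1]]     # exp(z J_-)
    lhs = matmul(w, exp_z_jminus)
    lower = [[1, 0], [-1 / z, 1]]       # exp(-J_-/z)
    diagonal = [[-z, 0], [0, -1 / z]]   # (-z)^(2 J_3)
    upper = [[1, 1 / z], [0, 1]]        # exp(J_+/z)
    rhs = matmul(matmul(lower, diagonal), upper)
    return lhs, rhs


def max_difference(z):
    """Largest entrywise difference between the two sides."""
    lhs, rhs = identity_sides(z)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(2) for j in range(2))
```

For example, `max_difference(0.7 + 0.4j)` should vanish up to rounding. The factorisation fails at $z = 0$, as it must: the second patch does not cover the south pole.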

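Exercise 7.2: Normalize the states $|z\rangle$, $\langle z|$, by multiplying them by $N = (1 + |z|^2)^{-j}$.
Show that

$$\begin{aligned}
N^2\langle z|J_3|z\rangle &= j\,\frac{|z|^2 - 1}{|z|^2 + 1},\\
N^2\langle z|J_+|z\rangle &= j\,\frac{2z}{|z|^2 + 1},\\
N^2\langle z|J_-|z\rangle &= j\,\frac{2\bar z}{|z|^2 + 1},
\end{aligned}$$

thus confirming the identification of $z$, $\bar z$ with the complex stereographic
co-ordinates on the sphere.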



7.3 Working in the Total Space 

We have mostly considered a bundle to be a collection of mathematical ob- 
jects attached to a base space, rather than treating the bundle as a geometric 
object in its own right. In this section we will demonstrate the advantages 
to be gained from the latter viewpoint. 

7.3.1 Principal Bundles and Associated bundles 

The fibre bundles that arise in a gauge theory with Lie group $G$ are called
principal $G$-bundles, and the fields and wavefunctions are sections of associated
bundles. A principal $G$-bundle comprises the total space, which we here
call $P$, together with the projection, $\pi$, to the base space $M$. The fibre can
be regarded as a copy of $G$:

$$\pi : P \to M, \qquad \pi^{-1}(x) \cong G. \tag{7.72}$$

Strictly speaking, the fibre is only required to be a homogeneous space on
which $G$ acts freely and transitively on the right; $x \to xg$. Such a set can
be identified with $G$ after we have selected a fiducial point $f_0 \in F$ to be
the group identity. There is no canonical choice for $f_0$ and, if the bundle is
twisted, there can be no globally smooth choice. This is because a smooth
choice for $f_0$ in the fibres above an open subset $U \subset M$ makes $P$ locally
into a product $U \times G$. Being able to extend $U$ to the entirety of $M$ means
that $P$ is trivial. We will, however, make use of local assignments $f_0 \mapsto e$
to introduce bundle co-ordinate charts in which $P$ is locally a product, and
therefore parametrized by ordered pairs $(x, g)$ with $x \in U$ and $g \in G$.

To understand the bundles associated with $P$, it is simplest to define the
sections of the associated bundle. Let $\varphi_i(x, g)$ be a function on the total
space $P$ with a set of indices $i$ carrying some representation $g \mapsto D(g)$ of
$G$. We say that $\varphi_i(x, g)$ is a section of an associated bundle if it varies in a
particular way as we run up and down the fibres by acting on them from the
right with elements of $G$. We require

$$\varphi_i(x, gh) = D_{ij}(h^{-1})\,\varphi_j(x, g). \tag{7.73}$$

These sections can be thought of as wavefunctions for a particle moving in
a gauge field on the base space. The choice of representation $D$ plays the
role of "charge," and (7.73) are the gauge transformations. Note that we
must take $h^{-1}$ as the argument of $D$ in order for the transformation to be
consistent under group multiplication:

$$\begin{aligned}
\varphi_i(x, gh_1h_2) &= D_{ij}(h_2^{-1})\,D_{jk}(h_1^{-1})\,\varphi_k(x, g)\\
&= D_{ik}(h_2^{-1}h_1^{-1})\,\varphi_k(x, g)\\
&= D_{ik}((h_1h_2)^{-1})\,\varphi_k(x, g).
\end{aligned} \tag{7.74}$$

The construction of the associated bundle itself requires rather more
abstraction. Suppose that the matrices $D(g)$ act on the vector space $V$. Then
the total space $P_V$ of the associated bundle consists of equivalence classes
of $P \times V$ under the relation $((x, g), \mathbf{v}) \sim ((x, gh), D(h^{-1})\mathbf{v})$ for all $\mathbf{v} \in V$,
$(x, g) \in P$ and $h \in G$. The set of $G$-action equivalence classes in a Cartesian
product $A \times B$ is usually denoted by $A \times_G B$. Our total space is therefore

$$P_V = P \times_G V. \tag{7.75}$$

We find it conceptually easier to work with the sections as defined above,
rather than with these equivalence classes.
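The need for $h^{-1}$ in the transformation rule can be made concrete with a pair of non-commuting group elements. The sketch below assumes the defining two-dimensional representation of SU(2), with the hypothetical sample elements $h_1 = e^{i\alpha\sigma_x}$ and $h_2 = e^{i\beta\sigma_z}$. It compares the composite factor obtained from the rule with $D(h^{-1})$ against the one a naive rule with $D(h)$ would give.

```python
import cmath
import math


def matmul(a, b):
    """Multiply two 2x2 matrices given as nested lists."""
    return [
        [sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
        for i in range(2)
    ]


def dagger(a):
    """Conjugate transpose, which is the inverse for a unitary matrix."""
    return [[complex(a[j][i]).conjugate() for j in range(2)] for i in range(2)]


def rotation_x(alpha):
    """exp(i*alpha*sigma_x), an element of SU(2)."""
    c, s = math.cos(alpha), math.sin(alpha)
    return [[c, 1j * s], [1j * s, c]]


def rotation_z(beta):
    """exp(i*beta*sigma_z), an element of SU(2)."""
    return [[cmath.exp(1j * beta), 0], [0, cmath.exp(-1j * beta)]]


def distance(a, b):
    """Largest entrywise difference between two 2x2 matrices."""
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))


h1, h2 = rotation_x(0.4), rotation_z(1.1)
# Moving up the fibre by h1 and then by h2 multiplies by D(h2^-1) D(h1^-1),
# which equals D((h1 h2)^-1): the rule is consistent.
with_inverse = distance(matmul(dagger(h2), dagger(h1)), dagger(matmul(h1, h2)))
# A naive rule using D(h) would give D(h2) D(h1), which differs from D(h1 h2).
without_inverse = distance(matmul(h2, h1), matmul(h1, h2))
```

Here `with_inverse` vanishes to rounding error, while `without_inverse` is of order one. For an abelian group such as U(1) the two orderings coincide, which is why the distinction is invisible in ordinary electromagnetism.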
7.3.2 Connections

A gauge field is a connection on a principal bundle. The formal definition of
a connection is a decomposition of the tangent space $TP_p$ of $P$ at $p \in P$ into
a horizontal subspace $H_p(P)$ and a vertical subspace $V_p(P)$. We require that
$V_p(P)$ be the tangent space to the fibres and $H_p(P)$ be a complementary
subspace, i.e., the direct sum should be the whole tangent space

$$TP_p = H_p(P) \oplus V_p(P). \tag{7.76}$$

The horizontal subspaces must also be invariant under the push-forward
induced from the action on the fibres from the right of a fixed element
of $G$. More formally, if $R[g] : P \to P$ acts to take $p \to pg$, i.e. by
$R[g](x, g') = (x, g'g)$, we require

$$R[g]_*H_p(P) = H_{pg}(P). \tag{7.77}$$

Thus, we get to choose one horizontal subspace in each fibre, the rest being
determined by the right-invariance condition.

Given a curve $x(t)$ in the base space we can, by solving the equation

$$\frac{\partial g}{\partial t} + \frac{\partial x^\mu}{\partial t}\mathcal{A}_\mu(x)\,g = 0, \tag{7.78}$$

lift it to a curve $(x(t), g(t))$ in the total space, whose tangent is everywhere
horizontal. This lifting operation corresponds to parallel transporting the
initial value $g(0)$ along the curve $x(t)$ to get $g(t)$. The $\mathcal{A}_\mu = i\lambda_a\mathcal{A}^a_\mu$ are a set of
Lie-algebra-valued functions that are determined by our choice of horizontal
subspace. They are defined so that the vector $(\delta x, -\mathcal{A}_\mu\delta x^\mu g)$ is horizontal
for each small displacement $\delta x^\mu$ in the tangent space of $M$. Here $-\mathcal{A}_\mu\delta x^\mu g$ is
to be understood as the displacement that takes $g \to (1 - \mathcal{A}_\mu\delta x^\mu)g$. Because
we are multiplying $\mathcal{A}$ in from the left, the lifted curve can be slid rigidly
up and down the fibres by the right action of any fixed group element. The
right-invariance condition is therefore automatically satisfied.
The directional derivative along the lifted curve is

$$\dot x^\mu\mathcal{D}_\mu = \dot x^\mu\left(\left(\frac{\partial}{\partial x^\mu}\right)_g - \mathcal{A}^a_\mu R_a\right), \tag{7.79}$$

where $R_a$ is a right-invariant vector field on $G$, i.e., a differential operator on
functions defined on the fibres. The $\mathcal{D}_\mu$ are a set of vector fields in $TP$. These
covariant derivatives span the horizontal subspace at each point $p \in P$, and
have Lie brackets

$$[\mathcal{D}_\mu, \mathcal{D}_\nu] = -\mathcal{F}^a_{\mu\nu}R_a. \tag{7.80}$$

Here $\mathcal{F}^a_{\mu\nu}$ is given in terms of the structure constants appearing in the Lie
brackets $[R_a, R_b] = f_{ab}{}^c R_c$ by

$$\mathcal{F}^c_{\mu\nu} = \partial_\mu\mathcal{A}^c_\nu - \partial_\nu\mathcal{A}^c_\mu - f_{ab}{}^c\mathcal{A}^a_\mu\mathcal{A}^b_\nu. \tag{7.81}$$

We can also write

$$\mathcal{F}_{\mu\nu} = \partial_\mu\mathcal{A}_\nu - \partial_\nu\mathcal{A}_\mu + [\mathcal{A}_\mu, \mathcal{A}_\nu], \tag{7.82}$$

where $\mathcal{F}_{\mu\nu} = i\lambda_a\mathcal{F}^a_{\mu\nu}$ and $[\lambda_a, \lambda_b] = if_{ab}{}^c\lambda_c$.

Because the Lie bracket of the $\mathcal{D}_\mu$ is a linear combination of the $R_a$, it lies
entirely in the vertical subspace. Consequently, when $\mathcal{F}_{\mu\nu} \neq 0$, the $\mathcal{D}_\mu$ are not
in involution, and Frobenius' theorem tells us that the horizontal subspaces
cannot fit together to form the tangent spaces to a smooth foliation of $P$.
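The failure of horizontal lifts to close up can be seen in the simplest abelian case. The sketch below assumes $G = \text{U}(1)$ with connection $-ieA_\mu$ (the identification used for the Landau problem in section 7.2.1), so that the lifting equation reads $\dot g = ie\,\dot x^\mu A_\mu\,g$, and takes the Landau-gauge potential $A = (0, Bx)$ in the plane. It lifts a circle of radius $R$ step by step and compares the end point with $e^{ie\Phi}$, where $\Phi = \pi R^2B$ is the enclosed flux. The function name and step count are illustrative.

```python
import cmath
import math


def lift_around_circle(e, B, R, steps=20000):
    """Horizontal lift of a circle of radius R, starting from g = 1.

    Each step multiplies g by exp(i*e*A.dx), with A = (0, B*x) evaluated at
    the midpoint of the step; only the y-displacement contributes.
    """
    g = 1 + 0j
    for k in range(steps):
        t0 = 2 * math.pi * k / steps
        t1 = 2 * math.pi * (k + 1) / steps
        x_mid = R * math.cos(0.5 * (t0 + t1))
        dy = R * (math.sin(t1) - math.sin(t0))
        g *= cmath.exp(1j * e * B * x_mid * dy)
    return g
```

For instance, `lift_around_circle(e=1.0, B=0.8, R=1.3)` should agree with `cmath.exp(1j * 0.8 * math.pi * 1.3**2)` to high accuracy. The lifted curve returns to its starting point in the fibre only when the enclosed flux is a multiple of $2\pi/e$, which illustrates why a non-zero curvature prevents the horizontal subspaces from being tangent to a foliation.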

We make contact with the more familiar definitions of covariant derivatives
by remembering that right invariant vector fields are derivatives that
involve infinitesimal multiplication from the left. Their definition is

$$R_a\varphi_i(x, g) = \lim_{\epsilon\to 0}\frac{1}{\epsilon}\Big(\varphi_i(x, (1 + i\epsilon\lambda_a)g) - \varphi_i(x, g)\Big), \tag{7.83}$$

where $[\lambda_a, \lambda_b] = if_{ab}{}^c\lambda_c$.

Since $\varphi_i(x, g)$ is a section of the associated bundle, we know how it varies
when we multiply group elements in on the right. We therefore write

$$(1 + i\epsilon\lambda_a)g = g\,g^{-1}(1 + i\epsilon\lambda_a)g, \tag{7.84}$$

and from this, (and writing $g$ for $D(g)$ where it makes for compact notation)
we find

$$\begin{aligned}
R_a\varphi_i(x, g) &= \lim_{\epsilon\to 0}\Big(D_{ij}(g^{-1}(1 - i\epsilon\lambda_a)g)\,\varphi_j(x, g) - \varphi_i(x, g)\Big)/\epsilon\\
&= -D_{ij}(g^{-1})\,(i\lambda_a)_{jk}\,D_{kl}(g)\,\varphi_l(x, g)\\
&= -i(g^{-1}\lambda_a g)_{ij}\,\varphi_j.
\end{aligned} \tag{7.85}$$

Here $i(\lambda_a)_{ij}$ is the matrix representing the Lie algebra generator $i\lambda_a$ in the
representation $g \mapsto D(g)$. Acting on sections, we therefore have

$$\mathcal{D}_\mu\varphi = (\partial_\mu\varphi)_g + (g^{-1}\mathcal{A}_\mu g)\,\varphi. \tag{7.86}$$
This still does not look too familiar because the derivatives with respect to
$x^\mu$ are being taken at fixed $g$. We normally fix a gauge by making a choice of
$g = \sigma(x)$ for each $x^\mu$. The conventional wavefunction $\varphi(x)$ is then $\varphi(x, \sigma(x))$.
We can use $\varphi(x, \sigma(x)) = \sigma^{-1}(x)\,\varphi(x, e)$, to obtain

$$\partial_\mu\varphi = (\partial_\mu\varphi)_\sigma + (\partial_\mu\sigma^{-1})\,\sigma\varphi = (\partial_\mu\varphi)_\sigma - (\sigma^{-1}\partial_\mu\sigma)\,\varphi. \tag{7.87}$$

From this we get a derivative

$$\nabla_\mu = \partial_\mu + (\sigma^{-1}\mathcal{A}_\mu\sigma + \sigma^{-1}\partial_\mu\sigma) = \partial_\mu + A_\mu \tag{7.88}$$

on functions $\varphi(x) = \varphi(x, \sigma(x))$ defined (locally) on the base space $M$. This
is the conventional covariant derivative, now containing gauge fields $A_\mu(x)$
that are gauge transformations of our $g$-independent $\mathcal{A}_\mu$. The derivative has
been constructed so that

$$\nabla_\mu\varphi(x) = \mathcal{D}_\mu\varphi(x, g)\big|_{g=\sigma(x)}, \tag{7.89}$$

and has commutator

$$[\nabla_\mu, \nabla_\nu] = \sigma^{-1}\mathcal{F}_{\mu\nu}\sigma = F_{\mu\nu}. \tag{7.90}$$

Note the sign change vis-a-vis equation (7.80).

It is the curvature tensor $F_{\mu\nu}$ that we have met previously. Recall that it
provides a Lie algebra valued two-form

$$F = \frac{1}{2}F_{\mu\nu}\,dx^\mu\wedge dx^\nu = dA + A^2 \tag{7.91}$$

on the base space. The connection $A = A_\mu dx^\mu$ is a one-form on the base
space, and both $F$ and $A$ have been defined only in the region $U \subset M$ where
the smooth gauge-choice section $\sigma(x)$ has been selected.

7.3.3 Monopole harmonics

The total-space operations and definitions seem rather abstract. We
demonstrate their power by solving the Schrödinger problem for a charged
particle confined to a unit sphere surrounding a magnetic monopole. The
conventional approach to this problem involves first selecting a gauge for the
vector potential $A$, which, because of the monopole, is necessarily singular
at a Dirac string located somewhere on the sphere, and then delving into
properties of Gegenbauer polynomials. Eventually we find the gauge-dependent
wavefunction. By working with the total space, however, we can solve the
problem in all gauges at once, and the problem becomes a simple exercise in
Lie group geometry.

Recall that the SU(2) representation matrices $D^J_{mn}(\theta,\phi,\psi)$ form
a complete orthonormal set of functions on the group manifold $S^3$. There will
be a similar complete orthonormal set of representation matrices on the
manifold of any compact Lie group $G$. Given a subgroup $H \subset G$, we will
use these matrices to construct bundles associated to a principal $H$-bundle
that has $G$ as its total space, and the coset space $G/H$ as its base space.
The fibres will be copies of $H$, and the projection $\pi$ the usual projection
$G \to G/H$.

The functions D J (g) are not in general functions on the coset space 
G/H as they depend on the choice of representative. Instead, because of 
the representation property, they vary with the choice of representative in a 
well-defined way, 

$$D^J_{mn}(gh) = D^J_{mn'}(g)\,D^J_{n'n}(h). \tag{7.92}$$

Since we are dealing with compact groups, the representations can be taken
to be unitary and

$$[D^J_{mn}(gh)]^* = [D^J_{mn'}(g)]^*\,[D^J_{n'n}(h)]^* \tag{7.93}$$
$$\phantom{[D^J_{mn}(gh)]^*} = D^J_{nn'}(h^{-1})\,[D^J_{mn'}(g)]^*. \tag{7.94}$$

This is the correct variation under the right action of the group $H$ for the
set of functions $[D^J_{mn}(gh)]^*$ to be sections of a bundle associated with
the principal fibre bundle $G \to G/H$. The representation $h \mapsto D(h)$ of
$H$ is not necessarily that defined by the label $J$ because irreducible
representations of $G$ may be reducible under $H$; $D$ depends on what
representation of $H$ the index $n$ belongs to. If $D$ is the identity
representation, then the functions are functions on $G/H$ in the ordinary
sense. For $G = {\rm SU}(2)$ and $H$ the U(1) subgroup generated by $J_3$, the
quotient space is just $S^2$, and the projection is the Hopf map:
$S^3 \to S^2$. The resulting bundle can be called the Hopf bundle. It is not a
really new object however, because it is a generalization of the monopole
bundle of the preceding section. Parameterizing SU(2) with Euler angles, so
that

$$D^J_{mn}(\theta,\phi,\psi) = \langle J,m|\,e^{-i\phi J_3}\,e^{-i\theta J_2}\,e^{-i\psi J_3}\,|J,n\rangle, \tag{7.95}$$

shows that the Hopf map consists of simply forgetting about $\psi$, so

$${\rm Hopf}: [(\theta,\phi,\psi) \in S^3] \mapsto [(\theta,\phi) \in S^2]. \tag{7.96}$$

The bundle is twisted because $S^3$ is not a product $S^2 \times S^1$. Taking
$n = 0$ gives us functions independent of $\psi$, and we obtain the well-known
identification of the spherical harmonics with representation matrices

$$Y^L_m(\theta,\phi) = \sqrt{\frac{2L+1}{4\pi}}\,[D^L_{m0}(\theta,\phi,0)]^*. \tag{7.97}$$

For $n = \Lambda \ne 0$ we get sections of a bundle with Chern number
$2\Lambda$. These sections are the monopole harmonics

$$\mathcal{Y}^J_{m;\Lambda}(\theta,\phi,\psi) = \sqrt{\frac{2J+1}{4\pi}}\,[D^J_{m\Lambda}(\theta,\phi,\psi)]^* \tag{7.98}$$

for a monopole of flux $\int eB\,d({\rm Area}) = 4\pi\Lambda$. The integrality of
the Chern number tells us that the flux $4\pi\Lambda$ must be an integer
multiple of $2\pi$. This gives us a geometric reason for why the eigenvalues
$m$ of $J_3$ can only be an integer or half integer.
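
As a small sanity check of the $n = 0$ identification (7.97), the sketch below compares both sides for $L = 1$. It assumes the Euler-angle parameterization (7.95), so that $D^1_{m0}(\theta,\phi,0) = e^{-im\phi}d^1_{m0}(\theta)$, together with the usual Condon-Shortley phase for the spherical harmonics; the function names are ours and purely illustrative.

```python
import cmath
import math

def d1_m0(m, theta):
    """Wigner small-d elements d^1_{m0}(theta) for m = -1, 0, 1."""
    if m == 0:
        return math.cos(theta)
    return -m * math.sin(theta) / math.sqrt(2)

def Y1_from_D(m, theta, phi):
    """Right-hand side of (7.97) for L = 1."""
    D = cmath.exp(-1j * m * phi) * d1_m0(m, theta)   # D^1_{m0}(theta, phi, 0)
    return math.sqrt(3 / (4 * math.pi)) * D.conjugate()

def Y1_standard(m, theta, phi):
    """Textbook L = 1 spherical harmonics (Condon-Shortley phase)."""
    if m == 0:
        return complex(math.sqrt(3 / (4 * math.pi)) * math.cos(theta))
    return -m * math.sqrt(3 / (8 * math.pi)) * math.sin(theta) * cmath.exp(1j * m * phi)

for m in (-1, 0, 1):
    print(m, abs(Y1_from_D(m, 0.7, 1.9) - Y1_standard(m, 0.7, 1.9)))
```

The printed differences are at the level of rounding error. A different phase convention for $d^J_{mn}$ would change individual signs but not the structure of the identification.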

The monopole harmonics have a non-trivial dependence, $\propto e^{i\psi\Lambda}$,
on the choice we make for $\psi$ at each point on $S^2$, and we cannot make a
globally smooth choice; we always encounter a point where there is a
singularity. These sections of the twisted bundle have to be constructed in
patches and glued together with transition functions.

We now show that the monopole harmonics are eigenfunctions of the
Schrödinger operator, $-\nabla^2$, containing the gauge field connection, just
as the spherical harmonics are eigenfunctions of the Laplacian on the sphere.
This is a simple geometrical exercise. Because they are irreducible
representations, the $D^J(g)$ are automatically eigenfunctions of the
quadratic Casimir operator

$$(J_1^2 + J_2^2 + J_3^2)\,D^J(g) = J(J+1)\,D^J(g). \tag{7.99}$$

The $J_i$ can be either right or left-invariant vector fields on $G$; the
quadratic Casimir is the same second-order differential operator in either
case, and it is a good guess that it is proportional to the Laplacian on the
group manifold. Taking a locally geodesic co-ordinate system (in which the
connection vanishes) confirms this: $J^2 = -\nabla^2$ on the three-sphere. The
operator in (7.99) is not the Laplacian we want, however. What we need is the
$\nabla^2$ on the two-sphere $S^2 = G/H$, including the connection. This
$\nabla^2$ operator differs from the one on the total space since it must
contain only differential operators lying in the horizontal subspaces. There
is a natural notion of orthogonality in the Lie group, deriving from the
Killing form, and it is natural to choose the horizontal subspaces to be
orthogonal to the fibres of $G/H$. Since multiplication on the right by the
subgroup generated by $J_3$ moves one up and down the fibres, the orthogonal
displacements are obtained by multiplication on the right by infinitesimal
elements made by exponentiating $J_1$ and $J_2$. The desired $\nabla^2$ is thus
made out of the left-invariant vector fields (which act by multiplication on
the right) $J_1$ and $J_2$ only. The wave operator must be

$$-\nabla^2 = J_1^2 + J_2^2 = J^2 - J_3^2. \tag{7.100}$$

Applying this to the $\mathcal{Y}^J_{m;\Lambda}$ we see that they are
eigenfunctions of $-\nabla^2$ on $S^2$ with eigenvalues $J(J+1) - \Lambda^2$.
The Laplace eigenvalues for our flux $4\pi\Lambda$ monopole problem are
therefore

$$E_{J,m} = J(J+1) - \Lambda^2, \qquad J \ge |\Lambda|, \quad -J \le m \le J. \tag{7.101}$$
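
The spectrum (7.101) is simple enough to tabulate directly. The following sketch (the function name and the use of exact fractions are our own choices) lists the lowest levels for a given $\Lambda$, using $J = |\Lambda|, |\Lambda|+1, \ldots$ and degeneracy $2J+1$.

```python
from fractions import Fraction

def monopole_levels(lam, n_levels=4):
    """Return (J, energy, degeneracy) for the lowest levels of -nabla^2.

    lam is the label Lambda (integer or half-integer), so that the flux
    is 4*pi*Lambda, i.e. 2*Lambda flux units of 2*pi.
    """
    lam = Fraction(lam)
    if (2 * lam).denominator != 1:
        raise ValueError("2*Lambda must be an integer")
    levels = []
    J = abs(lam)
    for _ in range(n_levels):
        energy = J * (J + 1) - lam ** 2   # eigenvalue J(J+1) - Lambda^2
        degeneracy = int(2 * J + 1)       # m = -J, ..., J
        levels.append((J, energy, degeneracy))
        J += 1                            # J increases in integer steps
    return levels

for J, E, g in monopole_levels(Fraction(3, 2)):
    print(f"J = {J}, E = {E}, degeneracy = {g}")
```

Writing $J = |\Lambda| + n$ gives $E = |\Lambda|(2n+1) + n(n+1)$, so the lowest level has energy $|\Lambda|$ and degeneracy $2|\Lambda|+1$, one more than the number of flux units.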

The utility of the monopole harmonics is not restricted to exotic monopole
physics. They occur in molecular and nuclear physics as the wavefunctions
for the rotational degrees of freedom of diatomic molecules and uniaxially
deformed nuclei that possess angular momentum $\Lambda$ about their axis of
symmetry.³

Exercise 7.3: Compare these energy levels for a particle on a sphere with those
of the Landau level problem on the plane. Show that for any fixed flux the
low-lying energies remain close to $E = (eB/m_{\rm particle})(n + 1/2)$, $n$
zero or a positive integer, but their degeneracy is equal to the number of
flux units penetrating the sphere plus one.

7.3.4 Bundle connection and curvature forms 

Recall that in section 7.3.2 we introduced the Lie-algebra-valued functions
$\mathcal{A}_\mu(x)$. We now use these functions to introduce the bundle
connection form $\mathbb{A}$ that lives in $T^*P$. We set

$$\mathcal{A} = \mathcal{A}_\mu\,dx^\mu \tag{7.102}$$

³ This is explained, with characteristic terseness, in a footnote on page 317 of
Landau and Lifshitz' Quantum Mechanics (Third Edition).

and

$$\mathbb{A} \stackrel{\rm def}{=} g^{-1}\left(\mathcal{A} + \delta g\,g^{-1}\right)g. \tag{7.103}$$

In these definitions, $x$ and $g$ are the local co-ordinates in which points in
the total space are labelled as $(x,g)$, and $d$ acts on functions of $x$, and
the "$\delta$" is used to denote the exterior derivative acting on the fibre.⁴
We have, then, that $\delta x^\mu = 0$ and $dg = 0$. The combinations
$\delta g\,g^{-1}$ and $g^{-1}\delta g$ are respectively the right- and
left-invariant Maurer-Cartan form on the group.

The complete exterior derivative in the total space requires us to
differentiate both with respect to $g$ and with respect to $x$, and is given by
$d_{\rm tot} = d + \delta$. Because $d^2$, $\delta^2$ and
$(d+\delta)^2 = d^2 + \delta^2 + d\delta + \delta d$ are all zero, we must have

$$\delta d + d\delta = 0. \tag{7.104}$$

We now define the bundle curvature form in terms of $\mathbb{A}$ to be

$$\mathbb{F} = d_{\rm tot}\mathbb{A} + \mathbb{A}^2. \tag{7.105}$$

To compute $\mathbb{F}$ in terms of $\mathcal{A}(x)$ and $g$ we need the
ingredients

$$d\mathbb{A} = g^{-1}(d\mathcal{A})g, \tag{7.106}$$

and

$$\delta\mathbb{A} = -(g^{-1}\delta g)(g^{-1}\mathcal{A}g) - (g^{-1}\mathcal{A}g)(g^{-1}\delta g) - (g^{-1}\delta g)^2. \tag{7.107}$$

We find that

$$\mathbb{F} = (d+\delta)\mathbb{A} + \mathbb{A}^2 = g^{-1}\left(d\mathcal{A} + \mathcal{A}^2\right)g = g^{-1}\mathcal{F}g, \tag{7.108}$$

where

$$\mathcal{F} = \frac{1}{2}\mathcal{F}_{\mu\nu}\,dx^\mu dx^\nu, \tag{7.109}$$

and

$$\mathcal{F}_{\mu\nu} = \partial_\mu\mathcal{A}_\nu - \partial_\nu\mathcal{A}_\mu + [\mathcal{A}_\mu, \mathcal{A}_\nu]. \tag{7.110}$$

Although we have defined the connection form $\mathbb{A}$ in terms of the local
bundle co-ordinates $(x,g)$, it is, in fact, an intrinsic quantity, i.e. it has
a global existence independent of the choice of these co-ordinates.
$\mathbb{A}$ has been constructed so that

• A vector is annihilated by $\mathbb{A}$ if and only if it is horizontal. In
particular $\mathbb{A}(\mathcal{D}_\mu) = 0$ for all covariant derivatives
$\mathcal{D}_\mu$.

• The connection form is constant on left-invariant vector fields on the
fibres. In particular $\mathbb{A}(L_a) = i\hat\lambda_a$.

Between them, the globally defined fields $\mathcal{D}_\mu \in H_p(P)$ and
$L_a \in V_p(P)$ span the tangent space $TP_p$. Consequently the two properties
listed above tell us how to evaluate $\mathbb{A}$ on any vector, and so define
it uniquely and globally.

⁴ It is not therefore to be confused with the Hodge $\delta = d^\dagger$ operator.

From the globally defined and gauge invariant $\mathbb{A}$ and its associated
curvature $\mathbb{F}$, and for any local gauge-choice section
$\sigma : (U \subset M) \to P$, we can recover the gauge-dependent base-space
forms $A$ and $F$ as the pull-backs

$$A = \sigma^*\mathbb{A}, \qquad F = \sigma^*\mathbb{F}, \tag{7.111}$$

to $U \subset M$ of the total-space forms. The resulting forms are

$$A = \left(\sigma^{-1}\mathcal{A}_\mu\sigma + \sigma^{-1}\partial_\mu\sigma\right)dx^\mu, \qquad F = \frac{1}{2}\left(\sigma^{-1}\mathcal{F}_{\mu\nu}\sigma\right)dx^\mu dx^\nu, \tag{7.112}$$

and coincide with the equations connecting $A_\mu$ with $\mathcal{A}_\mu$ and
$F_{\mu\nu}$ with $\mathcal{F}_{\mu\nu}$ that we obtained in section 7.3.2. We
should take care to note that the $dx^\mu$ that appear in $A$ and $F$ are
differential forms on $M$, while the $dx^\mu$ that appear in $\mathbb{A}$ and
$\mathbb{F}$ are differential forms on $P$. Now the projection $\pi$ is a left
inverse of the gauge-choice section $\sigma$, i.e. $\pi \circ \sigma =$
identity. The associated pull-backs are also inverses, but with the order
reversed: $\sigma^* \circ \pi^* =$ identity. These maps relate the two sets of
"$dx^\mu$" by

$$dx^\mu|_M = \sigma^*\left(dx^\mu|_P\right), \quad {\rm or} \quad dx^\mu|_P = \pi^*\left(dx^\mu|_M\right). \tag{7.113}$$

We now explain the advantage of knowing the total space connection and
curvature forms. Consider the Chern character $\propto {\rm tr}\,F^2$ on the
base-space $M$. We can use the bundle projection $\pi$ to pull this form back
to the total space. From

$$\mathbb{F} = (\sigma^{-1}g)^{-1}\,F\,(\sigma^{-1}g), \tag{7.114}$$

we find that

$$\pi^*\left({\rm tr}\,F^2\right) = {\rm tr}\,\mathbb{F}^2. \tag{7.115}$$

Now $\mathbb{A}$, $\mathbb{F}$ and $d_{\rm tot}$ have the same calculus
properties as $A$, $F$ and $d$. The manipulations that give

$${\rm tr}\,F^2 = d\,{\rm tr}\left(A\,dA + \frac{2}{3}A^3\right)$$

also show, therefore, that

$${\rm tr}\,\mathbb{F}^2 = d_{\rm tot}\,{\rm tr}\left(\mathbb{A}\,d_{\rm tot}\mathbb{A} + \frac{2}{3}\mathbb{A}^3\right). \tag{7.116}$$

There is a big difference in the significance of the computation, however. The
bundle connection $\mathbb{A}$ is globally defined. Consequently, the form

$$\omega_3(\mathbb{A}) \equiv {\rm tr}\left(\mathbb{A}\,d_{\rm tot}\mathbb{A} + \frac{2}{3}\mathbb{A}^3\right) \tag{7.117}$$

is also globally defined. The pull-back to the total space of the Chern
character is $d_{\rm tot}$ exact! This miracle works for all characteristic
classes: on the base-space they are exact only when the bundle is trivial; on
the total space they are always exact.

We have seen this phenomenon before, for example in exercise 6.7. The area
form $d[{\rm Area}] = \sin\theta\,d\theta\,d\phi$ is closed but not exact on
$S^2$. When pulled back to $S^3$ by the Hopf map, the area form becomes exact:

$${\rm Hopf}^*d[{\rm Area}] = \sin\theta\,d\theta\,d\phi = d\left(-\cos\theta\,d\phi + d\psi\right). \tag{7.118}$$

7.3.5 Characteristic classes as obstructions

The generalized Gauss-Bonnet theorem states that, for a compact orientable
even-dimensional manifold $M$, the integral of the Euler class over $M$ is
equal to the Euler character $\chi(M)$. Shiing-Shen Chern used the exactness of
the pull-back of the Euler class to give an elegant intrinsic proof⁵ of this
theorem. Chern showed that the integral of the Euler class over $M$ was equal
to the sum of the Poincaré-Hopf indices of any tangent vector field on $M$, a
sum we independently know to equal the Euler character $\chi(M)$. We illustrate
his strategy by showing how a non-zero ${\rm ch}_2(F)$ provides a similar index
sum for the singularities of any section of an SU(2)-bundle over a
four-dimensional base space. This result provides an interpretation of
characteristic classes as obstructions to the existence of global sections.

Let $\sigma : M \to P$ be a section of an SU(2) principal bundle $P$ over a
four-dimensional compact orientable manifold $M$ without boundary. For any
SU($n$) group we have ${\rm ch}_1(F) = 0$, but

$${\rm ch}_2(F) = -\frac{1}{8\pi^2}\,{\rm tr}\,F^2 \tag{7.119}$$

can be non-zero.

⁵ S.-S. Chern, Ann. Math. 47 (1946) 85-121. This paper is a readable classic.

The section $\sigma$ will, in general, have points $x_i$ where it becomes
singular. We punch infinitesimal holes in $M$ surrounding the singular points.
The manifold $M' = (M \setminus {\rm holes})$ will have as its boundary
$\partial M'$ a disjoint union of small three-spheres. We denote by $\Sigma$
the image of $M'$ under the map $\sigma : M' \to P$. This $\Sigma$ will be a
submanifold of $P$, whose boundary will be equal in homology to a linear
combination of the boundary components of $M'$ with integer coefficients. We
show that the Chern number $n$ is equal to the sum of these coefficients.

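We begin by using the projection $\pi$ to pull back ${\rm ch}_2(F)$ to the
bundle, where we know that

$$\pi^*{\rm ch}_2(F) = -\frac{1}{8\pi^2}\,d_{\rm tot}\,\omega_3(\mathbb{A}). \tag{7.120}$$

Now we can decompose $\omega_3(\mathbb{A})$ into terms of different bi-degree,
i.e. into terms that are $p$-forms in $d$ and $q$-forms in $\delta$:

$$\omega_3(\mathbb{A}) = \omega^0_3 + \omega^1_2 + \omega^2_1 + \omega^3_0. \tag{7.121}$$

Here the superscript counts the form-degree in $\delta$, and the subscript the
form-degree in $d$. The only term we need to know explicitly is $\omega^3_0$.
This comes from the $g^{-1}\delta g$ part of $\mathbb{A}$, and is

$$\begin{aligned}
\omega^3_0 &= {\rm tr}\left[(g^{-1}\delta g)\,\delta(g^{-1}\delta g) + \frac{2}{3}(g^{-1}\delta g)^3\right] \\
&= {\rm tr}\left(-(g^{-1}\delta g)^3 + \frac{2}{3}(g^{-1}\delta g)^3\right) \\
&= -\frac{1}{3}\,{\rm tr}\,(g^{-1}\delta g)^3.
\end{aligned} \tag{7.122}$$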

We next use the map $\sigma : M' \to P$ to pull the right-hand side of (7.120)
back from $P$ to $M'$. We recall that acting on forms on $M'$ we have
$\sigma^* \circ \pi^* =$ identity. Thus

$$\begin{aligned}
\int_M {\rm ch}_2(F) &= \int_{M'} {\rm ch}_2(F) = \int_{M'} \sigma^* \circ \pi^*\,{\rm ch}_2(F) \\
&= -\frac{1}{8\pi^2}\int_{M'} \sigma^*\,d_{\rm tot}\,\omega_3(\mathbb{A}) \\
&= -\frac{1}{8\pi^2}\int_{\Sigma} d_{\rm tot}\,\omega_3(\mathbb{A}) \\
&= -\frac{1}{8\pi^2}\int_{\partial\Sigma} \omega_3(\mathbb{A}) \\
&= \frac{1}{24\pi^2}\int_{\partial\Sigma} {\rm tr}\,(g^{-1}\delta g)^3.
\end{aligned} \tag{7.123}$$

At the first step we have observed that the omitted spheres make a negligible
contribution to the integral over $M$, and at the last step we have used the
fact that the boundary of $\Sigma$ has significant extent only along the
fibres, so all contributions to the integral over $\partial\Sigma$ come from
the purely vertical component of $\omega_3(\mathbb{A})$, which is
$\omega^3_0 = -\frac{1}{3}{\rm tr}\,(g^{-1}\delta g)^3$.

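We know (see exercise 6.8) that for maps $g \mapsto U \in {\rm SU}(2)$ we have

$$\int {\rm tr}\,(g^{-1}dg)^3 = 24\pi^2 \times \hbox{winding number}.$$

We conclude that

$$\int_M {\rm ch}_2(F) = \frac{1}{24\pi^2}\int_{\partial\Sigma} {\rm tr}\,(g^{-1}\delta g)^3 = \sum_{{\rm singularities}\ x_i} N_i, \tag{7.124}$$

where $N_i$ is the Brouwer degree of the map
$\sigma : S^3 \to {\rm SU}(2) \cong S^3$ on the small sphere surrounding $x_i$.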

It turns out that for any SU($n$) the integral of ${\rm tr}\,(g^{-1}\delta g)^3$
is $24\pi^2$ times an integer winding number of $g$ about homology spheres. The
second Chern number of an SU($n$)-bundle is therefore also equal to the sum of
the winding-number indices of the section about its singularities. Chern's
strategy can be used to relate other characteristic classes to obstructions to
the existence of global sections of appropriate bundles.

7.3.6 Stora-Zumino descent equations 

In the previous sections we met the forms

$$\mathbb{A} = g^{-1}\mathcal{A}g + g^{-1}\delta g \tag{7.125}$$

and

$$A = \sigma^{-1}\mathcal{A}\sigma + \sigma^{-1}d\sigma. \tag{7.126}$$

The group element $g$ labelled points on the fibres and was independent of $x$,
while $\sigma(x)$ was the gauge-choice section of the bundle and depended on
$x$. The two quantities $\mathbb{A}$ and $A$ look similar, but are not
identical. A third superficially similar but distinct object is met with in the
BRST (Becchi-Rouet-Stora-Tyutin) approach to quantizing gauge theories, and
also in the geometric theory of anomalies. We describe it here to alert the
reader to the potential for confusion.

Rather than attempting to define this new differential form rigorously,
we will first explain how to calculate with it, and only then indicate what it
is. We begin by considering a fixed connection form $A$ on $M$, and its orbit
under the action of the group $\mathcal{G}$ of gauge transformations. The
elements of this infinite dimensional group are maps $g : M \to G$ equipped
with pointwise product $g_1g_2(x) = g_1(x)g_2(x)$. This $g(x)$ is neither the
fibre co-ordinate $g$, nor the gauge-choice section $\sigma(x)$. The gauge
transformation $g(x)$ acts on $A$ to give $A^g$ where

$$A^g = g^{-1}Ag + g^{-1}dg. \tag{7.127}$$

We now introduce an object

$$v(x) = g^{-1}\delta g, \tag{7.128}$$

and consider

$$\mathfrak{A} = A^g + v = g^{-1}Ag + g^{-1}dg + g^{-1}\delta g. \tag{7.129}$$

This 1-form appears to be a hybrid of the earlier quantities, but we will
see that it has to be considered as something new. The essential difference
from what has gone before is that we want $v$ to behave like $g^{-1}\delta g$,
in that $\delta v = -v^2$, and yet to depend on $x$. In particular we want
$\delta$ to behave as an exterior derivative that implements an infinitesimal
gauge transformation that takes $g \to g + \delta g$. Thus,

$$\begin{aligned}
\delta(g^{-1}dg) &= -(g^{-1}\delta g)(g^{-1}dg) + g^{-1}\delta dg \\
&= -(g^{-1}\delta g)(g^{-1}dg) - (g^{-1}dg)(g^{-1}\delta g) + (g^{-1}dg)(g^{-1}\delta g) - g^{-1}d\delta g \\
&= -v(g^{-1}dg) - (g^{-1}dg)v - dv,
\end{aligned} \tag{7.130}$$

and hence

$$\delta A^g = -vA^g - A^g v - dv. \tag{7.131}$$

Previously $g^{-1}dg = 0$, and so there was no "$dv$" in $\delta$(gauge field).
We can define a curvature associated with $\mathfrak{A}$,

$$\mathfrak{F} = d_{\rm tot}\mathfrak{A} + \mathfrak{A}^2, \tag{7.132}$$

and compute

$$\begin{aligned}
\mathfrak{F} &= (d+\delta)(A^g + v) + (A^g + v)^2 \\
&= dA^g + dv + \delta A^g + \delta v + (A^g)^2 + A^g v + vA^g + v^2 \\
&= dA^g + (A^g)^2 \\
&= g^{-1}Fg.
\end{aligned} \tag{7.133}$$

Stora calls (7.133) the Russian formula.

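Because $\mathfrak{F}$ is yet another gauge transform of $F$, we have

$${\rm tr}\,F^2 = {\rm tr}\,\mathfrak{F}^2 = (d+\delta)\,{\rm tr}\left(\mathfrak{A}\,(d+\delta)\mathfrak{A} + \frac{2}{3}\mathfrak{A}^3\right) \tag{7.134}$$

and can decompose the right-hand side into terms that are simultaneously
$p$-forms in $d$ and $q$-forms in $\delta$.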

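The left hand side, ${\rm tr}\,\mathfrak{F}^2 = {\rm tr}\,F^2$, of (7.134) is
independent of $v$. The right hand side of (7.134) contains
$\omega_3(\mathfrak{A})$ which we expand as

$$\omega_3(A^g + v) = \omega^0_3(A^g) + \omega^1_2(v, A^g) + \omega^2_1(v, A^g) + \omega^3_0(v). \tag{7.135}$$

As in the previous section, the superscript counts the form-degree in
$\delta$, and the subscript the form-degree in $d$. Explicit computation shows
that

$$\begin{aligned}
\omega^0_3(A^g) &= {\rm tr}\left(A^g\,dA^g + \frac{2}{3}(A^g)^3\right), \\
\omega^1_2(v, A^g) &= {\rm tr}\,(v\,dA^g), \\
\omega^2_1(v, A^g) &= -{\rm tr}\,(A^g v^2), \\
\omega^3_0(v) &= -\frac{1}{3}\,{\rm tr}\,v^3.
\end{aligned} \tag{7.136}$$

For example,

$$\omega^3_0(v) = {\rm tr}\left(v\,\delta v + \frac{2}{3}v^3\right) = {\rm tr}\left(v(-v^2) + \frac{2}{3}v^3\right) = -\frac{1}{3}\,{\rm tr}\,v^3. \tag{7.137}$$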

With this decomposition, (7.134) falls apart into the chain of descent
equations

$$\begin{aligned}
{\rm tr}\,F^2 &= d\omega^0_3(A^g), \\
\delta\omega^0_3(A^g) &= -d\omega^1_2(v, A^g), \\
\delta\omega^1_2(v, A^g) &= -d\omega^2_1(v, A^g), \\
\delta\omega^2_1(v, A^g) &= -d\omega^3_0(v), \\
\delta\omega^3_0(v) &= 0.
\end{aligned} \tag{7.138}$$

Let us verify, for example, the penultimate equation
$\delta\omega^2_1(v, A^g) = -d\omega^3_0(v)$. The left-hand side is

$$-\delta\,{\rm tr}\,(A^g v^2) = -{\rm tr}\left(-A^g v^3 - vA^g v^2 - dv\,v^2\right) = {\rm tr}\,(dv\,v^2), \tag{7.139}$$

the terms involving $A^g$ having cancelled via the cyclic property of the trace
and the fact that $A^g$ anticommutes with $v$. The right-hand side is

$$-d\left(-\frac{1}{3}\,{\rm tr}\,v^3\right) = {\rm tr}\,(dv\,v^2) \tag{7.140}$$

as required.

The descent equations were introduced by Raymond Stora and Bruno Zumino as a
tool for obtaining and systematizing information about anomalies in the
quantum field theory of fermions interacting with the gauge field $A^g$. The
$\omega^q_p(v, A^g)$ are $p$-forms in the $dx^\mu$, and before use they are
integrated over $p$-cycles in $M$. This process is understood to produce local
functionals of $A^g$ that remain $q$-forms in $\delta g$. For example, in $2n$
space-time dimensions, the integral

$$I[g^{-1}\delta g, A^g] = \int_M \omega^1_{2n}(g^{-1}\delta g, A^g) \tag{7.141}$$

has the properties required for it to be a candidate for the anomalous
variation $\delta S[A^g]$ of the fermion effective action due to an
infinitesimal gauge transformation $g \to g + \delta g$. In particular, when
$\partial M = 0$, we have

$$\delta I[g^{-1}\delta g, A^g] = \int_M \delta\omega^1_{2n}(v, A^g) = -\int_M d\omega^2_{2n-1}(v, A^g) = 0. \tag{7.142}$$

This is the Wess-Zumino consistency condition that $\delta(\delta S)$ must obey
as a consequence of $\delta^2 = 0$.

In addition to producing a convenient solution of the Wess-Zumino condi- 
tion, the descent equations provide a compact derivation of the gauge trans- 
formation properties of useful differential forms. We will not seek to explain 
further the physical meaning of these forms, leaving this to a field theory 
course. 

The similarity between $\mathbb{A}$ and $\mathfrak{A}$ has led various authors
to attempt to identify them, and in particular to identify $v(x)$ with the
$g^{-1}\delta g$ Maurer-Cartan form appearing in $\mathbb{A}$. However the
physical meaning of expressions such as $d(g^{-1}\delta g)$ precludes such a
simple interpretation. In evaluating $dv \sim d(g^{-1}\delta g)$ on a vector
field $\xi^a(x)L_a$ representing an infinitesimal gauge transformation, we are
first to insert the field into $v \sim g^{-1}\delta g$ to obtain the
$x$-dependent Lie algebra element $i\xi^a(x)\hat\lambda_a$, and only then to
take the exterior derivative to obtain
$i\hat\lambda_a\,\partial_\mu\xi^a\,dx^\mu$. The result therefore involves
derivatives of the components $\xi^a(x)$. The evaluation of an ordinary
differential form on a vector field never produces derivatives of the vector
components.

To understand what the Stora-Zumino forms are, imagine that we equip a
two dimensional fibre bundle $E = M \times F$ with base-space co-ordinate $x$
and fibre co-ordinate $y$. A $p = 1$, $q = 1$ form on $E$ will then be
$F = f(x,y)\,dx\,\delta y$ for some function $f(x,y)$. There is only one object
$\delta y$, and there is no meaning to integrating $F$ over $x$ to leave a
1-form in $\delta y$ on $E$. The space of forms introduced by Stora and Zumino,
on the other hand, would contain elements such as

$$\mathfrak{J} = \int_M j(x,y)\,dx\,\delta y_x \tag{7.143}$$

where there is a distinct $\delta y_x$ for each $x \in M$. If we take, for
example, $j(x,y) = \delta'(x-a)$, we evaluate $\mathfrak{J}$ on the vector
field $Y(x,y)\partial_y$ as

$$\mathfrak{J}[Y(x,y)\partial_y] = \int \delta'(x-a)\,Y(x,y)\,dx = -Y'(a,y). \tag{7.144}$$
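
The appearance of a derivative of the vector components in (7.144) can be seen numerically. In the sketch below, which is only an illustration, $\delta'(x-a)$ is approximated by the derivative of a narrow normalised Gaussian of width `eps` (an assumption of ours), the fibre co-ordinate $y$ is held fixed and suppressed, and the integral is estimated with the trapezoidal rule.

```python
import math

def delta_prime(x, a, eps):
    """Derivative of a normalised Gaussian of width eps centred at a."""
    gauss = math.exp(-((x - a) ** 2) / (2 * eps ** 2)) / (eps * math.sqrt(2 * math.pi))
    return -(x - a) / eps ** 2 * gauss

def evaluate_J(Y, a, eps=1e-2, half_width=10.0, n=20001):
    """Trapezoidal estimate of the integral of delta'(x - a) Y(x) dx."""
    lo, hi = a - half_width * eps, a + half_width * eps
    h = (hi - lo) / (n - 1)
    total = 0.0
    for k in range(n):
        x = lo + k * h
        weight = 0.5 if k in (0, n - 1) else 1.0
        total += weight * delta_prime(x, a, eps) * Y(x)
    return total * h

a = 0.7
print(evaluate_J(math.sin, a), -math.cos(a))   # the two numbers nearly agree
```

As `eps` is reduced the estimate approaches $-Y'(a)$, which is the statement that the form sees the derivative of the component $Y$ and not merely its value at $a$.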

The conclusion is that the 1-form field $v(x) \sim g^{-1}\delta g$ must be
considered as the left-invariant Maurer-Cartan form on the infinite
dimensional Lie group $\mathcal{G}$, rather than a Maurer-Cartan form on the
finite dimensional Lie group $G$. The $\int_M \omega^q_{2n}(v, A^g)$ are
therefore elements of the cohomology group $H^q(A^{\mathcal{G}})$ of the
$\mathcal{G}$ orbit of $A$, a rather complicated object. For a thorough
discussion see: J. A. de Azcárraga, J. M. Izquierdo, Lie Groups, Lie Algebras,
Cohomology and some Applications in Physics, published by Cambridge University
Press.



Chapter 8 
Complex Analysis I 



Although this chapter is called complex analysis, we will try to develop 
the subject as complex calculus — meaning that we shall follow the calculus 
course tradition of telling you how to do things, and explaining why theorems 
are true, with arguments that would not pass for rigorous proofs in a course 
on real analysis. We try, however, to tell no lies. 

This chapter will focus on the basic ideas that need to be understood 
before we apply complex methods to evaluating integrals, analysing data, 
and solving differential equations. 

8.1 Cauchy-Riemann equations 

We focus on functions, f(z), of a single complex variable, z, where z = x + iy. 
We can think of these as being complex valued functions of two real variables, 
x and y. For example 



$$\begin{aligned}
f(z) = \sin z = \sin(x+iy) &= \sin x\cos iy + \cos x\sin iy \\
&= \sin x\cosh y + i\cos x\sinh y.
\end{aligned} \tag{8.1}$$



Here, we have used

$$\sin x = \frac{1}{2i}\left(e^{ix} - e^{-ix}\right), \qquad \sinh x = \frac{1}{2}\left(e^{x} - e^{-x}\right),$$
$$\cos x = \frac{1}{2}\left(e^{ix} + e^{-ix}\right), \qquad \cosh x = \frac{1}{2}\left(e^{x} + e^{-x}\right),$$

to make the connection between the circular and hyperbolic functions. We
shall often write $f(z) = u + iv$, where $u$ and $v$ are real functions of $x$
and $y$. In the present example, $u = \sin x\cosh y$ and $v = \cos x\sinh y$.
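
The decomposition of $\sin z$ into real and imaginary parts is easily checked against a library implementation. The short sketch below (the helper name is ours) compares Python's `cmath.sin` with $u + iv$ at a few arbitrarily chosen points.

```python
import cmath
import math

def sin_from_real_parts(x, y):
    """u + iv with u = sin x cosh y and v = cos x sinh y."""
    return complex(math.sin(x) * math.cosh(y), math.cos(x) * math.sinh(y))

for x, y in [(0.3, -1.2), (2.0, 0.5), (-1.1, 3.0)]:
    lhs = cmath.sin(complex(x, y))
    rhs = sin_from_real_parts(x, y)
    print(x, y, abs(lhs - rhs))   # differences at rounding level
```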
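If all four partial derivatives

$$\frac{\partial u}{\partial x}, \quad \frac{\partial v}{\partial y}, \quad \frac{\partial v}{\partial x}, \quad \frac{\partial u}{\partial y}$$

exist and are continuous then $f = u + iv$ is differentiable as a
complex-valued function of two real variables. This means that we can
approximate the variation in $f$ as

$$\delta f = \frac{\partial f}{\partial x}\delta x + \frac{\partial f}{\partial y}\delta y + \cdots,$$

where the dots represent a remainder that goes to zero faster than linearly
as $\delta x$, $\delta y$ go to zero. We now regroup the terms, setting
$\delta z = \delta x + i\delta y$, $\delta\bar z = \delta x - i\delta y$, so
that

$$\delta f = \frac{\partial f}{\partial z}\delta z + \frac{\partial f}{\partial\bar z}\delta\bar z + \cdots,$$

where we have defined

$$\begin{aligned}
\frac{\partial f}{\partial z} &= \frac{1}{2}\left(\frac{\partial f}{\partial x} - i\frac{\partial f}{\partial y}\right), \\
\frac{\partial f}{\partial\bar z} &= \frac{1}{2}\left(\frac{\partial f}{\partial x} + i\frac{\partial f}{\partial y}\right).
\end{aligned} \tag{8.5}$$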

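Now our function $f(z)$ does not depend on $\bar z$, and so it must satisfy

$$\frac{\partial f}{\partial\bar z} = 0. \tag{8.6}$$

Thus, with $f = u + iv$,

$$\frac{1}{2}\left(\frac{\partial}{\partial x} + i\frac{\partial}{\partial y}\right)(u + iv) = 0, \tag{8.7}$$

i.e.

$$\left(\frac{\partial u}{\partial x} - \frac{\partial v}{\partial y}\right) + i\left(\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\right) = 0. \tag{8.8}$$

Since the vanishing of a complex number requires the real and imaginary
parts to be separately zero, this implies that

$$\frac{\partial u}{\partial x} = \frac{\partial v}{\partial y}, \qquad \frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y}. \tag{8.9}$$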

These two relations between $u$ and $v$ are known as the Cauchy-Riemann
equations, although they were probably discovered by Gauss. If our continuous
partial derivatives satisfy the Cauchy-Riemann equations at
$z_0 = x_0 + iy_0$ then we say that the function is complex differentiable (or
just differentiable) at that point. By taking $\delta z = z - z_0$, we have

$$\delta f = f(z) - f(z_0) = \frac{\partial f}{\partial z}(z - z_0) + \cdots, \tag{8.10}$$

where the remainder, represented by the dots, tends to zero faster than
$|z - z_0|$ as $z \to z_0$. The validity of this linear approximation to the
variation in $f(z)$ is equivalent to the statement that the ratio

$$\frac{f(z) - f(z_0)}{z - z_0} \tag{8.11}$$

tends to a definite limit as $z \to z_0$ from any direction. It is the
direction-independence of this limit that provides a proper meaning to the
phrase "does not depend on $\bar z$." Since we are not allowing dependence on
$\bar z$, it is natural to drop the partial derivative signs and write the
limit as an ordinary derivative

$$\lim_{z\to z_0}\frac{f(z) - f(z_0)}{z - z_0} = \frac{df}{dz}. \tag{8.12}$$

We will also use Newton's fluxion notation

$$\frac{df}{dz} = f'(z). \tag{8.13}$$
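
The direction-independence of the limit (8.12) can be probed numerically. The sketch below, with helper names of our own choosing, evaluates the ratio (8.11) for $z$ on a small circle about $z_0$ and reports how much the values differ from one direction to another, first for $f(z) = z^2$ and then for the non-analytic $f = \bar z$.

```python
import cmath

def difference_quotients(f, z0, r=1e-6, n_dirs=8):
    """(f(z) - f(z0)) / (z - z0) for z on a small circle about z0."""
    quotients = []
    for k in range(n_dirs):
        dz = r * cmath.exp(2j * cmath.pi * k / n_dirs)
        quotients.append((f(z0 + dz) - f(z0)) / dz)
    return quotients

def spread(values):
    """Largest distance between any two of the given complex numbers."""
    return max(abs(a - b) for a in values for b in values)

z0 = 1.0 + 0.5j
print(spread(difference_quotients(lambda z: z * z, z0)))          # tiny
print(spread(difference_quotients(lambda z: z.conjugate(), z0)))  # of order one
```

For $z^2$ the quotient is $2z_0 + \delta z$, so the spread shrinks with the radius. For $\bar z$ the quotient is $e^{-2i\theta}$, which depends on the direction $\theta$ of approach however small the circle.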

The complex derivative obeys exactly the same calculus rules as ordinary
real derivatives:

$$\frac{d}{dz}z^n = nz^{n-1}, \qquad \frac{d}{dz}\sin z = \cos z.$$

If the function is differentiable at all points in an arcwise-connected¹ open
set, or domain, $D$, the function is said to be analytic there. The words
regular or holomorphic are also used.
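
A quick numerical test of the Cauchy-Riemann equations is sometimes a useful check on algebra. The following sketch (our own helper, using central differences with an assumed step `h`) returns the two combinations $u_x - v_y$ and $v_x + u_y$, both of which should vanish where the function is complex differentiable.

```python
import cmath

def cauchy_riemann_residuals(f, z, h=1e-6):
    """Central-difference estimates of (u_x - v_y, v_x + u_y) at z."""
    fx = (f(z + h) - f(z - h)) / (2 * h)             # d f / d x
    fy = (f(z + 1j * h) - f(z - 1j * h)) / (2 * h)   # d f / d y
    ux, vx = fx.real, fx.imag
    uy, vy = fy.real, fy.imag
    return ux - vy, vx + uy

z0 = 0.4 + 0.9j
print(cauchy_riemann_residuals(cmath.sin, z0))                  # both near zero
print(cauchy_riemann_residuals(lambda z: z.conjugate(), z0))    # first entry near 2
```

For $f = \bar z$ we have $u = x$, $v = -y$, so $u_x - v_y = 2$ and the first equation fails everywhere.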

8.1.1 Conjugate pairs 

The functions u and v comprising the real and imaginary parts of an analytic 
function are said to form a pair of harmonic conjugate functions. Such pairs 
have many properties that are useful for solving physical problems. 
From the Cauchy-Riemann equations we deduce that 

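$$\begin{aligned}
\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)u &= 0, \\
\left(\frac{\partial^2}{\partial x^2} + \frac{\partial^2}{\partial y^2}\right)v &= 0,
\end{aligned} \tag{8.15}$$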

and so both the real and imaginary parts of f(z) are automatically harmonic 
functions of x, y. 

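Further, from the Cauchy-Riemann conditions, we deduce that

$$\frac{\partial u}{\partial x}\frac{\partial v}{\partial x} + \frac{\partial u}{\partial y}\frac{\partial v}{\partial y} = 0. \tag{8.16}$$

This means that $\nabla u \cdot \nabla v = 0$. We conclude that, provided that
neither of these gradients vanishes, the pair of curves $u = {\rm const.}$ and
$v = {\rm const.}$ intersect at right angles. If we regard $u$ as the potential
$\varphi$ solving some electrostatics problem $\nabla^2\varphi = 0$, then the
curves $v = {\rm const.}$ are the associated field lines.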
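
Both properties, harmonicity and the orthogonality of the level curves, can be checked with finite differences. The sketch below uses the standard five-point stencil for the Laplacian and central differences for the gradients; the helper name and the step size `h` are our own choices.

```python
import cmath

def laplacians_and_grad_dot(f, z, h=1e-3):
    """Finite-difference (lap u, lap v, grad u . grad v) for f = u + iv at z."""
    c = f(z)
    e, w = f(z + h), f(z - h)
    n, s = f(z + 1j * h), f(z - 1j * h)
    lap = (e + w + n + s - 4 * c) / h ** 2   # equals lap u + i lap v
    fx = (e - w) / (2 * h)                   # u_x + i v_x
    fy = (n - s) / (2 * h)                   # u_y + i v_y
    dot = fx.real * fx.imag + fy.real * fy.imag
    return lap.real, lap.imag, dot

print(laplacians_and_grad_dot(cmath.exp, 0.3 + 0.2j))   # three small numbers
```

For an analytic function all three numbers vanish up to discretisation error of order $h^2$.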

Another application is to fluid mechanics. If ${\bf v}$ is the velocity field
of an irrotational ($\nabla \times {\bf v} = 0$) flow, then we can (perhaps
only locally) write the flow field as a gradient

$$v_x = \partial_x\phi, \qquad v_y = \partial_y\phi, \tag{8.17}$$

where $\phi$ is a velocity potential. If the flow is incompressible
($\nabla \cdot {\bf v} = 0$), then we can (locally) write it as a curl

$$v_x = \partial_y\chi, \qquad v_y = -\partial_x\chi, \tag{8.18}$$

where $\chi$ is a stream function. The curves $\chi = {\rm const.}$ are the
flow streamlines. If the flow is both irrotational and incompressible, then we
may use either $\phi$ or $\chi$ to represent the flow, and, since the two
representations must agree, we have

$$\partial_x\phi = +\partial_y\chi, \qquad \partial_y\phi = -\partial_x\chi. \tag{8.19}$$

Thus $\phi$ and $\chi$ are harmonic conjugates, and so the complex combination
$\Phi = \phi + i\chi$ is an analytic function called the complex stream
function.

¹ Arcwise connected means that any two points in $D$ can be joined by a
continuous path that lies wholly within $D$.

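A conjugate $v$ exists (at least locally) for any harmonic function $u$. To
see why, assume first that we have a $(u, v)$ pair obeying the Cauchy-Riemann
equations. Then we can write

$$\begin{aligned}
dv &= \frac{\partial v}{\partial x}dx + \frac{\partial v}{\partial y}dy \\
&= -\frac{\partial u}{\partial y}dx + \frac{\partial u}{\partial x}dy.
\end{aligned} \tag{8.20}$$

This observation suggests that if we are given a harmonic function $u$ in some
simply connected domain $D$, we can define a $v$ by setting

$$v(z) = \int_{z_0}^{z}\left(-\frac{\partial u}{\partial y}dx + \frac{\partial u}{\partial x}dy\right) + v(z_0), \tag{8.21}$$

for some real constant $v(z_0)$ and point $z_0$. The integral does not depend
on the choice of path from $z_0$ to $z$, and so $v(z)$ is well defined. The
path independence comes about because the curl

$$\frac{\partial}{\partial y}\left(-\frac{\partial u}{\partial y}\right) - \frac{\partial}{\partial x}\left(\frac{\partial u}{\partial x}\right) = -\nabla^2 u \tag{8.22}$$

vanishes, and because in a simply connected domain all paths connecting the
same endpoints are homologous.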
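
The path independence of the integral defining $v$ is easy to see in a concrete case. The sketch below, an illustration with names of our own, integrates $dv = -u_y\,dx + u_x\,dy$ along two different L-shaped paths for $u = x^2 - y^2$, whose conjugate with $v(0) = 0$ is $v = 2xy$. The partial derivatives of $u$ are supplied by hand.

```python
def conjugate_by_path(ux, uy, z0, z, x_first=True, n=2000):
    """Integrate dv = -u_y dx + u_x dy from z0 to z along an L-shaped path."""
    def leg(start, end):
        total = 0.0
        dz = (end - start) / n
        for k in range(n):
            mid = start + (k + 0.5) * dz   # midpoint rule on each sub-step
            total += (-uy(mid.real, mid.imag) * dz.real
                      + ux(mid.real, mid.imag) * dz.imag)
        return total
    # Corner of the path: move in x first, or in y first.
    corner = complex(z.real, z0.imag) if x_first else complex(z0.real, z.imag)
    return leg(z0, corner) + leg(corner, z)

# u = x^2 - y^2, with partial derivatives u_x = 2x and u_y = -2y.
ux = lambda x, y: 2 * x
uy = lambda x, y: -2 * y
z = 1.5 + 0.8j
print(conjugate_by_path(ux, uy, 0j, z, x_first=True),
      conjugate_by_path(ux, uy, 0j, z, x_first=False),
      2 * z.real * z.imag)
```

All three printed numbers agree. For a function $u$ that is not harmonic the two paths would in general give different answers.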

We now verify that this candidate $v(z)$ satisfies the Cauchy-Riemann
relations. The path independence allows us to make our final approach to
$z = x + iy$ along a straight line segment lying on either the $x$ or $y$ axis.
If we approach along the $x$ axis, we have

$$v(z) = \int^{x}\left(-\frac{\partial u}{\partial y}\right)dx' + \hbox{rest of integral}, \tag{8.23}$$

and may use

$$\frac{d}{dx}\int^{x} f(x', y)\,dx' = f(x, y) \tag{8.24}$$

to see that

$$\frac{\partial v}{\partial x} = -\frac{\partial u}{\partial y} \tag{8.25}$$

at $(x, y)$. If, instead, we approach along the $y$ axis, we may similarly
compute

$$\frac{\partial v}{\partial y} = \frac{\partial u}{\partial x}. \tag{8.26}$$

Thus $f(z)$ does indeed obey the Cauchy-Riemann equations.

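Because of the utility of the harmonic conjugate it is worth giving a
practical recipe for finding it, and so obtaining $f(z)$ when given only its
real part $u(x,y)$. The method we give below is one we learned from John
d'Angelo. It is more efficient than those given in most textbooks. We first
observe that if $f$ is a function of $z$ only, then $\overline{f(z)}$ depends
only on $\bar z$. We can therefore define a function $\bar f$ of $\bar z$ by
setting $\overline{f(z)} = \bar f(\bar z)$. Now

$$\frac{1}{2}\left(f(z) + \overline{f(z)}\right) = u(x,y). \tag{8.27}$$

Set

$$x = \frac{1}{2}(z + \bar z), \qquad y = \frac{1}{2i}(z - \bar z), \tag{8.28}$$

so

$$u\left(\frac{1}{2}(z + \bar z), \frac{1}{2i}(z - \bar z)\right) = \frac{1}{2}\left(f(z) + \bar f(\bar z)\right). \tag{8.29}$$

Now set $\bar z = 0$, while keeping $z$ fixed! Thus

$$f(z) + \bar f(0) = 2u\left(\frac{z}{2}, \frac{z}{2i}\right). \tag{8.30}$$

The function $f$ is not completely determined of course, because we can always
add a constant to $v$. Setting $z = 0$ in (8.30) shows that the real part of
$\bar f(0)$ is $u(0,0)$, and so we have the result

$$f(z) = 2u\left(\frac{z}{2}, \frac{z}{2i}\right) - u(0,0) + iC, \qquad C \in {\mathbb R}. \tag{8.31}$$

For example, let $u = x^2 - y^2$, for which $u(0,0) = 0$. We find

$$f(z) + \bar f(0) = 2\left(\frac{z}{2}\right)^2 - 2\left(\frac{z}{2i}\right)^2 = z^2, \tag{8.32}$$

or

$$f(z) = z^2 + iC, \qquad C \in {\mathbb R}. \tag{8.33}$$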

The business of setting $\bar z = 0$, while keeping $z$ fixed, may feel like
a dirty trick, but it can be justified by the (as yet to be proved) fact that
$f$ has a convergent expansion as a power series in $z = x + iy$. In this
expansion it is meaningful to let $x$ and $y$ themselves be complex, and so
allow $z$ and $\bar z$ to become two independent complex variables. Anyway,
you can always check ex post facto that your answer is correct.
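
Such an ex post facto check is easily automated. The sketch below implements the recipe as $f(z) = 2u(z/2, z/2i) - u(0,0) + iC$, where the subtraction of $u(0,0)$ accounts for the real part of $\bar f(0)$ and makes no difference when $u(0,0) = 0$, as in the example $u = x^2 - y^2$. The function name is ours, and $u$ must be written so that it accepts complex arguments.

```python
import cmath

def analytic_from_real_part(u, C=0.0):
    """Return f with Re f = u, built as 2 u(z/2, z/(2i)) - u(0, 0) + iC.

    u must be an expression in x and y that also accepts complex values.
    """
    u00 = u(0j, 0j)
    return lambda z: 2 * u(z / 2, z / 2j) - u00 + 1j * C

f1 = analytic_from_real_part(lambda x, y: x * x - y * y)
f2 = analytic_from_real_part(lambda x, y: cmath.exp(x) * cmath.cos(y))

z = 0.3 - 1.1j
print(f1(z), z * z)          # recovers z^2
print(f2(z), cmath.exp(z))   # recovers e^z, where u(0,0) = 1 matters
```

The second example, $u = e^x\cos y$, shows why the constant matters: $2u(z/2, z/2i) = e^z + 1$, and subtracting $u(0,0) = 1$ recovers $e^z$.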

8.1.2 Conformal Mapping 

An analytic function $w = f(z)$ maps subsets of its domain of definition in
the "$z$" plane on to subsets in the "$w$" plane. These maps are often useful
for solving problems in two dimensional electrostatics or fluid flow. Their
simplest property is geometrical: such maps are conformal.



Figure 8.1: An illustration of conformal mapping. The unshaded "triangle" 
marked z is mapped into the other five unshaded regions by the functions 
labeling them. Observe that although the regions are distorted, the angles of 
the "triangle" are preserved by the maps (with the exception of those corners 
that get mapped to infinity). 






Suppose that the derivative of $f(z)$ at a point $z_0$ is non-zero. Then, for $z$
near $z_0$ we have

$$f(z) - f(z_0) \approx A(z - z_0), \tag{8.34}$$

where

$$A = \left.\frac{df}{dz}\right|_{z_0}. \tag{8.35}$$



If you think about the geometric interpretation of complex multiplication
(multiply the magnitudes, add the arguments) you will see that the "$f$"
image of a small neighbourhood of $z_0$ is stretched by a factor $|A|$, and rotated
through an angle $\arg A$, but relative angles are not altered. The map $z \mapsto
f(z) = w$ is therefore isogonal. Our map also preserves orientation (the sense
of rotation of the relative angle) and these two properties, isogonality and
orientation-preservation, are what make the map conformal.$^2$ The conformal
property fails at points where the derivative vanishes or becomes infinite.

If we can find a conformal map $z\ (= x + iy) \mapsto w\ (= u + iv)$ of some
domain $D$ to another $D'$ then a function $f(z)$ that solves a potential theory
problem (a Dirichlet boundary-value problem, for example) in $D$ will lead to
$f(z(w))$ solving an analogous problem in $D'$.

Consider, for example, the map $z \mapsto w = z + e^z$. This map takes the
strip $-\infty < x < \infty$, $-\pi < y < \pi$ to the entire complex plane with cuts from
$-\infty + i\pi$ to $-1 + i\pi$ and from $-\infty - i\pi$ to $-1 - i\pi$. The cuts occur because
the images of the lines $y = \pm\pi$ get folded back on themselves at $w = -1 \pm i\pi$,
where the derivative of $w(z)$ vanishes. (See figure 8.2.)

In this case, the imaginary part of the function $f(z) = x + iy$ trivially
solves the Dirichlet problem $\nabla^2_{x,y}\, y = 0$ in the infinite strip, with $y = \pi$
on the upper boundary and $y = -\pi$ on the lower boundary. The function
$y(u,v)$, now quite non-trivially, solves $\nabla^2_{u,v}\, y = 0$ in the entire $w$ plane, with
$y = \pi$ on the half-line running from $-\infty + i\pi$ to $-1 + i\pi$, and $y = -\pi$ on the
half-line running from $-\infty - i\pi$ to $-1 - i\pi$. We may regard the images of
the lines $y = \text{const.}$ (solid curves) as being the streamlines of an irrotational
and incompressible flow out of the end of a tube into an infinite region, or as
the equipotentials near the edge of a pair of capacitor plates. In the latter
case, the images of the lines $x = \text{const.}$ (dotted curves) are the corresponding
field-lines.
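
The folding of the strip edges can be seen in a few lines of Python. This is a minimal sketch for illustration; the function names `w` and `dw` are choices made here, not notation from the notes. It checks that the derivative $1 + e^z$ vanishes at $z = \pm i\pi$, and that the images of points on $y = \pi$ stay on $\operatorname{Im} w = \pi$ with real part $x - e^x$ never exceeding $-1$.

```python
import cmath
import math


def w(z):
    """The map w = z + e^z of the strip -pi < y < pi."""
    return z + cmath.exp(z)


def dw(z):
    """Its derivative dw/dz = 1 + e^z."""
    return 1 + cmath.exp(z)


# The derivative vanishes at z = i*pi, whose image is the tip of the cut.
tip = w(1j * math.pi)
print(abs(dw(1j * math.pi)), tip)

# Points on the upper edge y = pi: the image stays on Im w = pi, and
# Re w = x - e^x rises to -1 at x = 0 and then turns back.
xs = [-3.0, -1.0, 0.0, 1.0, 3.0]
edge = [w(complex(x, math.pi)) for x in xs]
for x, image in zip(xs, edge):
    print(x, image)
```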

Figure 8.2: Image of part of the strip $-\pi < y < \pi$, $-\infty < x < \infty$ under the
map $z \mapsto w = z + e^z$.

Example: The Joukowski map. This map is famous in the history of aeronautics
because it can be used to map the exterior of a circle to the exterior
of an aerofoil-shaped region. We can use the Milne-Thomson circle theorem
(see 8.3.2) to find the streamlines for the flow past a circle in the $z$ plane,
and then use Joukowski's transformation,

$$w = f(z) = \frac{1}{2}\left(z + \frac{1}{z}\right), \tag{8.36}$$

to map this simple flow to the flow past the aerofoil. To produce an aerofoil
shape, the circle must go through the point $z = 1$, where the derivative of $f$
vanishes, and the image of this point becomes the sharp trailing edge of the
aerofoil.

2 If $f$ were a function of $\bar z$ only, then the map would still be isogonal, but would reverse
the orientation. We call such maps antiholomorphic or anti-conformal.

The Riemann Mapping Theorem 

There are tables of conformal maps for $D$, $D'$ pairs, but an underlying
principle is provided by the Riemann mapping theorem:

Theorem: The interior of any simply connected domain $D$ in $\mathbb{C}$ whose
boundary consists of more than one point can be mapped conformally one-to-one
and onto the interior of the unit circle. It is possible to choose an arbitrary
interior point $w_0$ of $D$ and map it to the origin, and to take an arbitrary
direction through $w_0$ and make it the direction of the real axis. With these
two choices the mapping is unique.



This theorem was first stated in Riemann's PhD thesis in 1851. He regarded
it as "obvious" for the reason that we will give as a physical "proof."
Riemann's argument is not rigorous, however, and it was not until 1912 that
a real proof was obtained by Constantin Carathéodory. A proof that is both
shorter and more in the spirit of Riemann's ideas was given by Leopold Fejér
and Frigyes Riesz in 1922.




Figure 8.3: The Riemann mapping theorem.

For the physical "proof," observe that in the function

$$-\frac{1}{2\pi}\ln z = -\frac{1}{2\pi}\left\{\ln|z| + i\theta\right\}, \tag{8.37}$$

the real part $\phi = -\frac{1}{2\pi}\ln|z|$ is the potential of a unit charge at the origin,
with the additive constant chosen so that $\phi = 0$ on the circle $|z| = 1$.
Now imagine that we have solved the two-dimensional electrostatics problem
of finding the potential for a unit charge located at $w_0 \in D$, also with the
boundary of $D$ being held at zero potential. We have

$$\nabla^2\phi_1 = -\delta^2(w - w_0), \qquad \phi_1 = 0 \ \text{on}\ \partial D. \tag{8.38}$$

Now find the $\phi_2$ that is harmonically conjugate to $\phi_1$. Set

$$\phi_1 + i\phi_2 \equiv \Phi(w) = -\frac{1}{2\pi}\ln(z e^{i\alpha}), \tag{8.39}$$

where $\alpha$ is a real constant. We see that the transformation $w \mapsto z$, or

$$z = e^{-i\alpha} e^{-2\pi\Phi(w)}, \tag{8.40}$$

does the job of mapping the interior of $D$ into the interior of the unit circle,
and the boundary of $D$ to the boundary of the unit circle. Note how our
freedom to choose the constant $\alpha$ is what allows us to "take an arbitrary
direction through $w_0$ and make it the direction of the real axis."
Example: To find the map that takes the upper half-plane into the unit
circle, with the point $w = i$ mapping to the origin, we use the method of
images to solve for the complex potential of a unit charge at $w = i$:

$$\phi_1 + i\phi_2 = -\frac{1}{2\pi}\left(\ln(w - i) - \ln(w + i)\right).$$

Therefore

$$z = e^{-i\alpha}\,\frac{w - i}{w + i}. \tag{8.41}$$

We immediately verify that this works: we have $|z| = 1$ when $w$ is real,
and $z = 0$ at $w = i$.
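
The verification can also be carried out numerically. A minimal Python sketch, with the function name `to_disc` chosen here purely for illustration:

```python
import cmath


def to_disc(w, alpha=0.0):
    """Map the upper half-plane to the unit disc, sending w = i to z = 0."""
    return cmath.exp(-1j * alpha) * (w - 1j) / (w + 1j)


print(to_disc(1j))                       # the centre of the disc
for x in (-10.0, -1.0, 0.0, 0.5, 7.0):
    print(x, abs(to_disc(x)))            # real w lands on the unit circle
print(abs(to_disc(2 + 3j)))              # a point with Im w > 0 lands inside
```

Changing `alpha` only rotates the disc about its centre, which is the freedom to choose the direction of the real axis.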

The difficulty with the physical argument is that it is not clear that a
solution to the point-charge electrostatics problem exists. In three dimensions,
for example, there is no solution when the boundary has a sharp inward
directed spike. (We cannot physically realize such a situation either: the
electric field becomes unboundedly large near the tip of a spike, and
boundary charge will leak off and neutralize the point charge.) There might well
be analogous difficulties in two dimensions if the boundary of $D$ is
pathological. However, the fact that there is a proof of the Riemann mapping
theorem shows that the two-dimensional electrostatics problem does always
have a solution, at least in the interior of $D$, even if the boundary is an
infinite-length fractal. Unless $\partial D$ is reasonably smooth, though, the
resulting Riemann map cannot be continuously extended to the boundary. When
the boundary of $D$ is a smooth closed curve, the boundary of $D$
will map one-to-one and continuously onto the boundary of the unit circle.

Exercise 8.1: Van der Pauw's Theorem.$^3$ This problem explains a practical
method for determining the conductivity $\sigma$ of a material, given a sample in
the form of a wafer of uniform thickness $d$, but of irregular shape. In practice
at the Philips company in Eindhoven, this was a wafer of semiconductor cut
from an unmachined boule.




Figure 8.4: A thin semiconductor wafer with attached leads. 

We attach leads to point contacts $A$, $B$, $C$, $D$, taken in anticlockwise order, on
the periphery of the wafer and drive a current $I_{AB}$ from $A$ to $B$. We record the
potential difference $V_D - V_C$ and so find $R_{AB,DC} = (V_D - V_C)/I_{AB}$. Similarly
we measure $R_{BC,AD}$. The current flow in the wafer is assumed to be
two-dimensional, and to obey

$$\mathbf{J} = -(\sigma d)\nabla V, \qquad \nabla\cdot\mathbf{J} = 0,$$

and $\mathbf{n}\cdot\mathbf{J} = 0$ at the boundary (except at the current source and drain). The
potential $V$ is therefore harmonic, with Neumann boundary conditions.

3 L. J. Van der Pauw, Philips Research Reps. 13 (1958) 1. See also A. M. Thompson,
D. G. Lampard, Nature 177 (1956) 888, and D. G. Lampard, Proc. Inst. Elec. Eng. C
104 (1957) 271, for the "Calculable Capacitor."

Van der Pauw claims that

$$\exp\{-\pi\sigma d\, R_{AB,DC}\} + \exp\{-\pi\sigma d\, R_{BC,AD}\} = 1.$$

From this $\sigma d$ can be found numerically.

a) First show that Van der Pauw's claim is true if the wafer were the entire
upper half-plane with $A$, $B$, $C$, $D$ on the real axis with $x_A < x_B < x_C < x_D$.

b) Next, taking care to consider the transformation of the current source 
terms and the Neumann boundary conditions, show that the claim is 
invariant under conformal maps, and, by mapping the wafer to the upper 
half-plane, show that it is true in general. 



8.2 Complex Integration: Cauchy and Stokes 

In this section we will define the integral of an analytic function, and make 
contact with the exterior calculus from chapters 2-4. The most obvious 
difference between the real and complex integral is that in evaluating the 
definite integral of a function in the complex plane we must specify the path 
along which we integrate. When this path of integration is the boundary of 
a region, it is often called a contour from the use of the word in the graphic 
arts to describe the outline of something. The integrals themselves are then 
called contour integrals. 



8.2.1 The Complex Integral 

The complex integral

$$\int_\Gamma f(z)\,dz \tag{8.42}$$

over a path $\Gamma$ may be defined by expanding out the real and imaginary parts

$$\int_\Gamma f(z)\,dz = \int_\Gamma (u + iv)(dx + i\,dy) = \int_\Gamma (u\,dx - v\,dy) + i\int_\Gamma (v\,dx + u\,dy), \tag{8.43}$$

and treating the two integrals on the right hand side as standard
vector-calculus line-integrals of the form $\int \mathbf{v}\cdot d\mathbf{r}$, one with $\mathbf{v} \to (u, -v)$ and one
with $\mathbf{v} \to (v, u)$.

Figure 8.5: A chain approximation to the curve $\Gamma$.



The complex integral can also be constructed as the limit of a Riemann sum
in a manner parallel to the definition of the real-variable Riemann integral
of elementary calculus. Replace the path $\Gamma$ with a chain composed of $N$
line-segments $z_0$-to-$z_1$, $z_1$-to-$z_2$, all the way to $z_{N-1}$-to-$z_N$. Now let $\xi_m$ lie
on the line segment joining $z_{m-1}$ and $z_m$. Then the integral $\int_\Gamma f(z)\,dz$ is the
limit of the (Riemann) sum

$$\sum_{m=1}^{N} f(\xi_m)(z_m - z_{m-1}) \tag{8.44}$$

as $N$ gets large and all the $|z_m - z_{m-1}| \to 0$. For this definition to make
sense and be useful, the limit must be independent of both how we chop up
the curve and how we select the points $\xi_m$. This may be shown to be the
case when the integration path is smooth and the function being integrated
is continuous.

The Riemann-sum definition of the integral leads to a useful inequality:
combining the triangle inequality $|a + b| \le |a| + |b|$ with $|ab| = |a||b|$ we
deduce that

$$\left|\sum_{m=1}^{N} f(\xi_m)(z_m - z_{m-1})\right| \le \sum_{m=1}^{N} \left|f(\xi_m)(z_m - z_{m-1})\right| = \sum_{m=1}^{N} |f(\xi_m)|\,|z_m - z_{m-1}|. \tag{8.45}$$

For sufficiently smooth curves the last sum converges to the real integral
$\int_\Gamma |f(z)|\,|dz|$, and we deduce that

$$\left|\int_\Gamma f(z)\,dz\right| \le \int_\Gamma |f(z)|\,|dz|. \tag{8.46}$$

For curves $\Gamma$ that are smooth enough to have a well-defined length $|\Gamma|$, we
will have $\int_\Gamma |dz| = |\Gamma|$. From this we conclude that if $|f| \le M$ on $\Gamma$, then we
have the Darboux inequality

$$\left|\int_\Gamma f(z)\,dz\right| \le M|\Gamma|. \tag{8.47}$$



We shall find many uses for this inequality. 
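
As an illustration of both the Riemann-sum construction and the Darboux bound, here is a small Python sketch. It is a sketch under stated assumptions rather than anything from the original notes: the helper names `riemann_sum` and `chain_length` are hypothetical, the points $\xi_m$ are taken to be segment midpoints, and the path is a quarter of the unit circle.

```python
import cmath
import math


def riemann_sum(f, points):
    """Sum f(xi_m) (z_m - z_{m-1}) with xi_m the midpoint of each segment."""
    total = 0j
    for z_prev, z_next in zip(points, points[1:]):
        xi = 0.5 * (z_prev + z_next)
        total += f(xi) * (z_next - z_prev)
    return total


def chain_length(points):
    """Length of the polygonal chain, which approximates |Gamma|."""
    return sum(abs(b - a) for a, b in zip(points, points[1:]))


# Quarter of the unit circle from z = 1 to z = i, chopped into N segments.
N = 2000
points = [cmath.exp(1j * (math.pi / 2) * m / N) for m in range(N + 1)]

f = lambda z: z**2 + 1
integral = riemann_sum(f, points)
M = max(abs(f(z)) for z in points)       # bound on |f| along the path
print(integral)
print(abs(integral), "<=", M * chain_length(points))
```

The bound is far from tight here because $|f|$ falls to zero at the end $z = i$ of the path.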

The Riemann sum definition also makes it clear that if $f(z)$ is the
derivative of another analytic function $g(z)$, i.e.

$$f(z) = \frac{dg}{dz}, \tag{8.48}$$

then, for $\Gamma$ a smooth path from $z = a$ to $z = b$, we have

$$\int_\Gamma f(z)\,dz = g(b) - g(a). \tag{8.49}$$

This follows by approximating $f(\xi_m) \approx (g(z_m) - g(z_{m-1}))/(z_m - z_{m-1})$, and
observing that the resultant Riemann sum

$$\sum_{m=1}^{N} \left(g(z_m) - g(z_{m-1})\right) \tag{8.50}$$

telescopes. The approximation to the derivative will become exact in the
limit $|z_m - z_{m-1}| \to 0$. Thus, when $f(z)$ is the derivative of another function,
the integral is independent of the route that $\Gamma$ takes from $a$ to $b$.
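
Path independence is easy to see numerically. The sketch below is illustrative only; the helpers `riemann_sum` and `segment` are hypothetical names. It integrates $f(z) = 3z^2$, the derivative of $g(z) = z^3$, from $a = 0$ to $b = 1 + i$ along a straight line and along a dog-leg through $z = 1$.

```python
def riemann_sum(f, points):
    """Midpoint Riemann sum of f along the chain of points."""
    total = 0j
    for z_prev, z_next in zip(points, points[1:]):
        total += f(0.5 * (z_prev + z_next)) * (z_next - z_prev)
    return total


def segment(a, b, n):
    """n + 1 equally spaced points on the straight line from a to b."""
    return [a + (b - a) * m / n for m in range(n + 1)]


f = lambda z: 3 * z**2
g = lambda z: z**3

a, b = 0j, 1 + 1j
direct = segment(a, b, 1000)
dogleg = segment(a, 1 + 0j, 1000) + segment(1 + 0j, b, 1000)[1:]

print(riemann_sum(f, direct))
print(riemann_sum(f, dogleg))
print(g(b) - g(a))
```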

We shall see that any analytic function is (at least locally) the derivative
of another analytic function, and so this path independence holds generally,
provided that we do not try to move the integration contour over a place
where $f$ ceases to be differentiable. This is the essence of what is known as
Cauchy's Theorem, although, as with much of complex analysis, the result
was known to Gauss.



8.2.2 Cauchy's theorem 

Before we state and prove Cauchy's theorem, we must introduce an
orientation convention and some traditional notation. Recall that a $p$-chain is a
finite formal sum of $p$-dimensional oriented surfaces or curves, and that a
$p$-cycle is a $p$-chain $\Gamma$ whose boundary vanishes: $\partial\Gamma = 0$. A 1-cycle that
consists of only a single connected component is a closed curve. We will mostly
consider integrals over simple closed curves (these being curves that do not
self intersect) or 1-cycles consisting of finite formal sums of such curves.
The orientation of a simple closed curve can be described by the sense,
clockwise or anticlockwise, in which we traverse it. We will adopt the convention
that a positively oriented curve is one such that the integration is performed
in an anticlockwise direction. The integral over a chain $\Gamma$ of oriented simple
closed curves will be denoted by the symbol $\oint_\Gamma f\,dz$.

We now establish Cauchy's theorem by relating it to our previous work
with exterior derivatives: Suppose that $f$ is analytic within a domain $D$, so
that $\partial_{\bar z} f = 0$ within $D$. We therefore have that the exterior derivative of
$f$ is

$$df = \partial_z f\,dz + \partial_{\bar z} f\,d\bar z = \partial_z f\,dz. \tag{8.51}$$

Now suppose that the simple closed curve $\Gamma$ is the boundary of a region
$\Omega \subset D$. We can exploit Stokes' theorem to deduce that

$$\oint_{\Gamma = \partial\Omega} f(z)\,dz = \int_\Omega d\left(f(z)\,dz\right) = \int_\Omega (\partial_z f)\,dz \wedge dz = 0. \tag{8.52}$$

The last integral is zero because $dz \wedge dz = 0$. We may state our result as:

Theorem (Cauchy, in modern language): The integral of an analytic function
over a 1-cycle that is homologous to zero vanishes.

The zero result is only guaranteed if the function $f$ is analytic throughout
the region $\Omega$. For example, if $\Gamma$ is the unit circle $z = e^{i\theta}$ then

$$\oint_\Gamma \left(\frac{1}{z}\right) dz = \int_0^{2\pi} e^{-i\theta}\, d\left(e^{i\theta}\right) = i\int_0^{2\pi} d\theta = 2\pi i. \tag{8.53}$$

Cauchy's theorem is not applicable because $1/z$ is singular, i.e. not
differentiable, at $z = 0$. The formula (8.53) will hold for $\Gamma$ any contour homologous
to the unit circle in $\mathbb{C} \setminus 0$, the complex plane punctured by the removal of
the point $z = 0$. Thus

$$\oint_\Gamma \left(\frac{1}{z}\right) dz = 2\pi i \tag{8.54}$$

for any contour $\Gamma$ that encloses the origin. We can deduce a rather remarkable
formula from (8.54): Writing $\Gamma = \partial\Omega$ with anticlockwise orientation, we use
Stokes' theorem to obtain

$$\oint_{\partial\Omega} \left(\frac{1}{z}\right) dz = \int_\Omega \partial_{\bar z}\left(\frac{1}{z}\right) d\bar z \wedge dz = \begin{cases} 2\pi i, & 0 \in \Omega, \\ 0, & 0 \notin \Omega. \end{cases} \tag{8.55}$$

Since $d\bar z \wedge dz = 2i\,dx \wedge dy$, we have established that

$$\partial_{\bar z}\left(\frac{1}{z}\right) = \pi\delta^2(x, y). \tag{8.56}$$

This rather cryptic formula encodes one of the most useful results in
mathematics.

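Perhaps perversely, functions that are more singular than $1/z$ have
vanishing integrals about their singularities. With $\Gamma$ again the unit circle, we
have

$$\oint_\Gamma \left(\frac{1}{z^2}\right) dz = \int_0^{2\pi} e^{-2i\theta}\, d\left(e^{i\theta}\right) = i\int_0^{2\pi} e^{-i\theta}\, d\theta = 0. \tag{8.57}$$

The same is true for all higher integer powers:

$$\oint_\Gamma \left(\frac{1}{z^n}\right) dz = 0, \qquad n \ge 2. \tag{8.58}$$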
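
These statements can be checked with a short numerical contour integral. The following Python sketch is for illustration; `circle_integral` is a hypothetical helper that sums $f(z)\,dz$ over equally spaced points of a circle traversed anticlockwise.

```python
import cmath
import math


def circle_integral(f, centre=0j, radius=1.0, n_points=4000):
    """Approximate the anticlockwise integral of f round a circle."""
    total = 0j
    for m in range(n_points):
        theta = 2 * math.pi * m / n_points
        z = centre + radius * cmath.exp(1j * theta)
        dz = 1j * (z - centre) * (2 * math.pi / n_points)  # dz = i r e^{i theta} d theta
        total += f(z) * dz
    return total


# Unit circle: only n = 1 gives a non-zero answer.
for n in (1, 2, 3):
    print(n, circle_integral(lambda z, n=n: z ** (-n)))

# An off-centre circle that still encloses the origin gives the same n = 1 result.
print(circle_integral(lambda z: 1 / z, centre=0.3 + 0.2j))
# A circle that does not enclose the origin gives zero.
print(circle_integral(lambda z: 1 / z, centre=3 + 0j))
```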

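We can understand this vanishing in another way, by evaluating the
integral as

$$\oint_\Gamma \left(\frac{1}{z^n}\right) dz = \oint_\Gamma \frac{d}{dz}\left(-\frac{1}{n-1}\,\frac{1}{z^{n-1}}\right) dz = \left[-\frac{1}{n-1}\,\frac{1}{z^{n-1}}\right]_\Gamma = 0, \qquad n \ne 1. \tag{8.59}$$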

Here, the notation $[A]_\Gamma$ means the difference in the value of $A$ at two ends
of the integration path $\Gamma$. For a closed curve the difference is zero because
the two ends are at the same point. This approach reinforces the fact that
the complex integral can be computed from the "anti-derivative" in the same
way as the real-variable integral. We also see why $1/z$ is special. It is the
derivative of $\ln z = \ln|z| + i\arg z$, and $\ln z$ is not really a function, as it is
multivalued. In evaluating $[\ln z]_\Gamma$ we must follow the continuous evolution
of $\arg z$ as we traverse the contour. As the origin is within the contour, this
angle increases by $2\pi$, and so

$$[\ln z]_\Gamma = [i\arg z]_\Gamma = i\left(\arg e^{2\pi i} - \arg e^{0i}\right) = 2\pi i. \tag{8.60}$$
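
Following the continuous evolution of $\arg z$ is straightforward to mimic in code. In this illustrative Python sketch (the function name `change_in_log` is an assumption made here), the principal logarithm of the ratio of successive points supplies the small increment of $\ln z$ on each step, so the accumulated total never jumps across a branch cut.

```python
import cmath
import math


def change_in_log(points):
    """Follow ln z continuously along a chain of points that avoids z = 0."""
    total = 0j
    for z_prev, z_next in zip(points, points[1:]):
        # For closely spaced points the ratio is near 1, so its principal
        # logarithm is the small, continuous increment of ln z.
        total += cmath.log(z_next / z_prev)
    return total


N = 400
thetas = [2 * math.pi * m / N for m in range(N + 1)]
# A closed ellipse round the origin, and a closed circle that misses it.
ellipse = [complex(2 * math.cos(t), math.sin(t)) for t in thetas]
far_circle = [3 + cmath.exp(1j * t) for t in thetas]

print(change_in_log(ellipse))      # close to 2*pi*i
print(change_in_log(far_circle))   # close to zero
```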

Exercise 8.2: Suppose $f(z)$ is analytic in a simply-connected domain $D$, and
$z_0 \in D$. Set $g(z) = \int_{z_0}^{z} f(z)\,dz$ along some path in $D$ from $z_0$ to $z$. Use the
path-independence of the integral to compute the derivative of $g(z)$ and show
that

$$f(z) = \frac{dg}{dz}.$$

This confirms our earlier claim that any analytic function is the derivative of
some other analytic function.

Exercise 8.3: The "D-bar" problem: Suppose we are given a simply-connected
domain $\Omega$, and a function $f(z, \bar z)$ defined on it, and wish to find a function
$F(z, \bar z)$ such that

$$\frac{\partial F(z, \bar z)}{\partial \bar z} = f(z, \bar z), \qquad (z, \bar z) \in \Omega.$$

Use (8.56) to argue formally that the general solution is

$$F(\zeta, \bar\zeta) = -\frac{1}{\pi}\int_\Omega \frac{f(z, \bar z)}{z - \zeta}\, dx \wedge dy + g(\zeta),$$

where $g(\zeta)$ is an arbitrary analytic function. This result can be shown to be
correct by more rigorous reasoning.



8.2.3 The residue theorem 

The essential tool for computations with complex integrals is provided by 
the residue theorem. With the aid of this theorem, the evaluation of contour 
integrals becomes easy. All one has to do is identify points at which the 
function being integrated blows up, and examine just how it blows up. 
If, near the point $z_i$, the function can be written

$$f(z) = \left\{\frac{a_N^{(i)}}{(z - z_i)^N} + \cdots + \frac{a_2^{(i)}}{(z - z_i)^2} + \frac{a_1^{(i)}}{(z - z_i)}\right\} g^{(i)}(z), \tag{8.61}$$

where $g^{(i)}(z)$ is analytic and non-zero at $z_i$, then $f(z)$ has a pole of order $N$ at
$z_i$. If $N = 1$ then $f(z)$ is said to have a simple pole at $z_i$. We can normalize
$g^{(i)}(z)$ so that $g^{(i)}(z_i) = 1$, and then the coefficient, $a_1^{(i)}$, of $1/(z - z_i)$ is
called the residue of the pole at $z_i$. The coefficients of the more singular
terms do not influence the result of the integral, but $N$ must be finite for the
singularity to be called a pole.

Theorem: Let the function $f(z)$ be analytic within and on the boundary
$\Gamma = \partial D$ of a simply connected domain $D$, with the exception of a finite number
of points at which $f(z)$ has poles. Then

$$\oint_\Gamma f(z)\,dz = \sum_{\text{poles} \in D} 2\pi i\,(\text{residue at pole}), \tag{8.62}$$

the integral being traversed in the positive (anticlockwise) sense.

We prove the residue theorem by drawing small circles $C_i$ about each
singular point $z_i$ in $D$.

Figure 8.6: Circles for the residue theorem.

We now assert that

$$\oint_\Gamma f(z)\,dz = \sum_i \oint_{C_i} f(z)\,dz, \tag{8.63}$$

because the 1-cycle

$$C \equiv \Gamma - \sum_i C_i = \partial\Omega \tag{8.64}$$

is the boundary of a region $\Omega$ in which $f$ is analytic, and hence $C$ is
homologous to zero. If we make the radius $R_i$ of the circle $C_i$ sufficiently small, we
may replace each $g^{(i)}(z)$ by its limit $g^{(i)}(z_i) = 1$, and so take

$$f(z) = \left\{\frac{a_1^{(i)}}{(z - z_i)} + \frac{a_2^{(i)}}{(z - z_i)^2} + \cdots + \frac{a_N^{(i)}}{(z - z_i)^N}\right\} g^{(i)}(z) \to \frac{a_1^{(i)}}{(z - z_i)} + \frac{a_2^{(i)}}{(z - z_i)^2} + \cdots + \frac{a_N^{(i)}}{(z - z_i)^N}, \tag{8.65}$$

on $C_i$. We then evaluate the integral over $C_i$ by using our previous results
to get

$$\oint_{C_i} f(z)\,dz = 2\pi i\, a_1^{(i)}. \tag{8.66}$$

The integral around $\Gamma$ is therefore equal to $2\pi i \sum_i a_1^{(i)}$.

The restriction to contours containing only finitely many poles arises for
two reasons: Firstly, with infinitely many poles, the sum over $i$ might not
converge; secondly, there may be a point whose every neighbourhood contains
infinitely many of the poles, and there our construction of drawing circles
around each individual pole would not be possible.
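
A numerical check of the residue theorem for a function with two simple poles may be helpful. The Python sketch below is illustrative; `circle_integral` is a hypothetical helper and the test function $f(z) = e^z/(z(z-1))$ is chosen here, not taken from the notes. Its residues are $-1$ at $z = 0$ and $e$ at $z = 1$.

```python
import cmath
import math


def circle_integral(f, centre=0j, radius=1.0, n_points=4000):
    """Approximate the anticlockwise integral of f round a circle."""
    total = 0j
    for m in range(n_points):
        z = centre + radius * cmath.exp(2j * math.pi * m / n_points)
        dz = 1j * (z - centre) * (2 * math.pi / n_points)
        total += f(z) * dz
    return total


f = lambda z: cmath.exp(z) / (z * (z - 1))

# |z| = 2 encloses both poles; |z| = 1/2 encloses only the pole at z = 0.
big = circle_integral(f, radius=2.0)
small = circle_integral(f, radius=0.5)

print(big, 2j * math.pi * (math.e - 1))
print(small, -2j * math.pi)
```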

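Exercise 8.4: Poisson's Formula. The function $f(z)$ is analytic in $|z| < R'$.
Prove that if $|a| < R < R'$,

$$f(a) = \frac{1}{2\pi i}\oint_{|z| = R} \frac{R^2 - \bar a a}{(z - a)(R^2 - \bar a z)}\, f(z)\,dz.$$

Deduce that, for $0 < r < R$,

$$f(re^{i\theta}) = \frac{1}{2\pi}\int_0^{2\pi} \frac{R^2 - r^2}{R^2 - 2Rr\cos(\theta - \phi) + r^2}\, f(Re^{i\phi})\,d\phi.$$

Show that this formula solves the boundary-value problem for Laplace's
equation in the disc $|z| < R$.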

Exercise 8.5: Bergman Kernel. The Hilbert space of analytic functions on a
domain $D$ with inner product

$$\langle f, g\rangle = \int_D \bar f g\, dx\,dy$$

is called the Bergman$^4$ space of $D$.

a) Suppose that $\varphi_n(z)$, $n = 0, 1, 2, \ldots$, are a complete set of orthonormal
functions on the Bergman space. Show that

$$K(\zeta, z) = \sum_{m=0}^{\infty} \varphi_m(\zeta)\overline{\varphi_m(z)}$$

has the property that

$$g(\zeta) = \iint_D K(\zeta, z)\, g(z)\, dx\,dy$$

for any function $g$ analytic in $D$. Thus $K(\zeta, z)$ plays the role of the delta
function on the space of analytic functions on $D$. This object is called
the reproducing or Bergman kernel. By taking $g(z) = \varphi_n(z)$, show that
it is the unique integral kernel with the reproducing property.

b) Consider the case of $D$ being the unit circle. Use the Gram-Schmidt
procedure to construct an orthonormal set from the functions $z^n$, $n =
0, 1, 2, \ldots$. Use the result of part a) to conjecture (because we have not
proved that the set is complete) that, for the unit circle,

$$K(\zeta, z) = \frac{1}{\pi}\,\frac{1}{(1 - \zeta\bar z)^2}.$$

c) For any smooth, complex valued, function $g$ defined on a domain $D$ and
its boundary, use Stokes' theorem to show that

$$\iint_D \partial_{\bar z}\, g(z, \bar z)\, dx\,dy = \frac{1}{2i}\oint_{\partial D} g(z, \bar z)\,dz.$$

Use this to verify that the $K(\zeta, z)$ you constructed in part b) is
indeed a (and hence "the") reproducing kernel.

d) Now suppose that $D$ is a simply connected domain whose boundary $\partial D$
is a smooth curve. We know from the Riemann mapping theorem that
there exists an analytic function $f(z) = f(z;\zeta)$ that maps $D$ onto the
interior of the unit circle in such a way that $f(\zeta) = 0$ and $f'(\zeta)$ is real
and non-zero. Show that if we set $K(\zeta, z) = \overline{f'(z)}\,f'(\zeta)/\pi$, then, by using
part c) together with the residue theorem to evaluate the integral over
the boundary, we have

$$g(\zeta) = \iint_D K(\zeta, z)\, g(z)\, dx\,dy.$$

This $K(\zeta, z)$ must therefore be the reproducing kernel. We see that if we
know $K$ we can recover the map $f$ from

$$f'(z;\zeta) = \sqrt{\frac{\pi}{K(\zeta, \zeta)}}\; K(z, \zeta).$$

e) Apply the formula from part d) to the unit circle, and so deduce that

$$f(z;\zeta) = \frac{z - \zeta}{1 - \bar\zeta z}$$

is the unique function that maps the unit circle onto itself with the point
$\zeta$ mapping to the origin and with the horizontal direction through $\zeta$
remaining horizontal.

4 This space should not be confused with the Bargmann-Fock space of analytic functions
on the entirety of $\mathbb{C}$ with inner product

$$\langle f, g\rangle = \int_{\mathbb{C}} e^{-|z|^2}\, \bar f g\, d^2z.$$

Stefan Bergman and Valentine Bargmann are two different people.



8.3 Applications

We now know enough about complex variables to work through some inter- 
esting applications, including the mechanism by which an aeroplane flies. 



8.3.1 Two-dimensional vector calculus 

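It is often convenient to use complex co-ordinates for vectors and tensors. In
these co-ordinates the standard metric on $\mathbb{R}^2$ becomes

$$\begin{aligned}
\text{``}ds^2\text{''} &= dx \otimes dx + dy \otimes dy \\
&= d\bar z \otimes dz \\
&= g_{zz}\, dz \otimes dz + g_{\bar z z}\, d\bar z \otimes dz + g_{z\bar z}\, dz \otimes d\bar z + g_{\bar z\bar z}\, d\bar z \otimes d\bar z,
\end{aligned} \tag{8.67}$$

so the complex co-ordinate components of the metric tensor are $g_{zz} = g_{\bar z\bar z} = 0$,
$g_{z\bar z} = g_{\bar z z} = \frac{1}{2}$. The inverse metric tensor is $g^{z\bar z} = g^{\bar z z} = 2$, $g^{zz} = g^{\bar z\bar z} = 0$.
In these co-ordinates the Laplacian is

$$\nabla^2 = g^{ij}\partial^2_{ij} = 2(\partial_z\partial_{\bar z} + \partial_{\bar z}\partial_z). \tag{8.68}$$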

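When $f$ has singularities, it is not safe to assume that $\partial_z\partial_{\bar z} f = \partial_{\bar z}\partial_z f$. For
example, from

$$\partial_{\bar z}\left(\frac{1}{z}\right) = \pi\delta^2(x, y) \tag{8.69}$$

we deduce that

$$\partial_{\bar z}\partial_z \ln z = \pi\delta^2(x, y). \tag{8.70}$$

When we evaluate the derivatives in the opposite order, however, we have

$$\partial_z\partial_{\bar z} \ln z = 0. \tag{8.71}$$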

To understand the source of the non-commutativity, take real and imaginary
parts of these last two equations. Write $\ln z = \ln|z| + i\theta$, where $\theta = \arg z$,
and add and subtract. We find

$$\nabla^2 \ln|z| = 2\pi\delta^2(x, y), \qquad (\partial_x\partial_y - \partial_y\partial_x)\theta = 2\pi\delta^2(x, y). \tag{8.72}$$

The first of these shows that $\frac{1}{2\pi}\ln|z|$ is the Green function for the Laplace
operator, and the second reveals that the vector field $\nabla\theta$ is singular, having
a delta function "curl" at the origin.

If we have a vector field $\mathbf{v}$ with contravariant components $(v^x, v^y)$ and
(numerically equal) covariant components $(v_x, v_y)$ then the covariant components
in the complex co-ordinate system are $v_z = \frac{1}{2}(v_x - iv_y)$ and $v_{\bar z} = \frac{1}{2}(v_x + iv_y)$.
This can be obtained by using the change of co-ordinates rule, but a quicker
route is to observe that

$$\mathbf{v}\cdot d\mathbf{r} = v_x\,dx + v_y\,dy = v_z\,dz + v_{\bar z}\,d\bar z. \tag{8.73}$$

Now

$$\partial_{\bar z} v_z = \frac{1}{4}(\partial_x v_x + \partial_y v_y) + i\,\frac{1}{4}(\partial_y v_x - \partial_x v_y). \tag{8.74}$$

Thus the statement that $\partial_{\bar z} v_z = 0$ is equivalent to the vector field $\mathbf{v}$ being
both solenoidal (incompressible) and irrotational. This can also be expressed
in form language by setting $\eta = v_z\,dz$ and saying that $d\eta = 0$ means that the
corresponding vector field is both solenoidal and irrotational.



8.3.2 Milne-Thomson Circle Theorem 

As we mentioned earlier, we can describe an irrotational and incompressible
fluid motion either by a velocity potential

$$v_x = \partial_x\phi, \qquad v_y = \partial_y\phi, \tag{8.75}$$

where $\mathbf{v}$ is automatically irrotational but incompressibility requires $\nabla^2\phi = 0$,
or by a stream function

$$v_x = \partial_y\chi, \qquad v_y = -\partial_x\chi, \tag{8.76}$$

where $\mathbf{v}$ is automatically incompressible but irrotationality requires $\nabla^2\chi = 0$.
We can combine these into a single complex stream function $\Phi = \phi + i\chi$
which, for an irrotational incompressible flow, satisfies the Cauchy-Riemann
equations and is therefore an analytic function of $z$. We see that

$$2v_z = \frac{d\Phi}{dz}, \tag{8.77}$$

$\phi$ and $\chi$ making equal contributions.

The Milne-Thomson theorem says that if $\Phi$ is the complex stream
function for a flow in unobstructed space, then

$$\tilde\Phi = \Phi(z) + \bar\Phi\left(\frac{a^2}{z}\right) \tag{8.78}$$

is the stream function after the cylindrical obstacle $|z| = a$ is inserted into
the flow. Here $\bar\Phi(z)$ denotes the analytic function defined by $\bar\Phi(z) = \overline{\Phi(\bar z)}$.
To see that this works, observe that $a^2/z = \bar z$ on the curve $|z| = a$, and so on
this curve $\operatorname{Im}\tilde\Phi = \chi = 0$. The surface of the cylinder has therefore become
a streamline, and so the flow does not penetrate into the cylinder. If the
original flow is created by sources and sinks exterior to $|z| = a$, which will be
singularities of $\Phi$, the additional term has singularities that lie only within
$|z| = a$. These will be the "images" of the sources and sinks in the sense of
the "method of images."

Example: A uniform flow with speed $U$ in the $x$ direction has $\Phi = Uz$.
Inserting a cylinder makes this

$$\tilde\Phi(z) = U\left(z + \frac{a^2}{z}\right). \tag{8.79}$$

Because $v_z$ is the derivative of this, we see that the perturbing effect of the
obstacle on the velocity field falls off as the square of the distance from the
cylinder. This is a general result for obstructed flows.
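
The circle theorem is simple to try out in code. The Python sketch below is an illustration under stated assumptions: the helper name `with_cylinder` is hypothetical, and the uniform stream is allowed to arrive at an angle `alpha` to the $x$ axis (a generalization introduced here, with $\Phi = Ue^{-i\alpha}z$). It builds $\Phi(z) + \bar\Phi(a^2/z)$ and checks that the imaginary part vanishes on $|z| = a$.

```python
import cmath
import math


def with_cylinder(phi, a):
    """Return the function z -> phi(z) + phi_bar(a^2 / z)."""
    def phi_bar(z):
        # phi_bar(z) is the complex conjugate of phi evaluated at conj(z).
        return phi(z.conjugate()).conjugate()
    return lambda z: phi(z) + phi_bar(a * a / z)


U, a, alpha = 1.5, 2.0, 0.4
uniform = lambda z: U * cmath.exp(-1j * alpha) * z   # stream at angle alpha
flow = with_cylinder(uniform, a)

# Im(flow) should vanish on the cylinder, making it a streamline.
on_cylinder = [a * cmath.exp(2j * math.pi * m / 12) for m in range(12)]
print(max(abs(flow(z).imag) for z in on_cylinder))

# With alpha = 0 the construction reproduces U (z + a^2 / z).
z0 = 3.0 + 1.0j
print(with_cylinder(lambda z: U * z, a)(z0), U * (z0 + a * a / z0))
```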




Figure 8.7: The real and imaginary parts of the function $z + z^{-1}$ provide the
velocity potentials and streamlines for irrotational incompressible flow past
a cylinder of unit radius.

8.3.3 Blasius and Kutta-Joukowski Theorems 

We now derive the celebrated result, discovered independently by Martin
Wilhelm Kutta (1902) and Nikolai Egorovich Joukowski (1906), that the
lift per unit span of an aircraft wing is equal to the product of the density
of the air $\rho$, the circulation $\kappa = \oint \mathbf{v}\cdot d\mathbf{r}$ about the wing, and the forward
velocity $U$ of the wing through the air. Their theory treats the air as being
incompressible (a good approximation unless the flow-velocities approach
the speed of sound) and assumes that the wing is long enough that the flow
can be regarded as being two-dimensional.




Figure 8.8: Flow past an aerofoil. 

Begin by recalling how the momentum flux tensor

$$T_{ij} = \rho v_i v_j + g_{ij} P \tag{8.80}$$

enters fluid mechanics. In cartesian co-ordinates, and in the presence of an
external body force $f_i$ acting on the fluid, the Euler equation of motion for
the fluid is

$$\rho(\partial_t v_i + v^j\partial_j v_i) = -\partial_i P + f_i. \tag{8.81}$$

Here $P$ is the pressure and we are distinguishing between co and contravariant
components, although at the moment $g_{ij} \equiv \delta_{ij}$. We can combine Euler's
equation with the law of mass conservation,

$$\partial_t\rho + \partial_i(\rho v^i) = 0, \tag{8.82}$$

to obtain

$$\partial_t(\rho v_i) + \partial^j(\rho v_j v_i + g_{ij} P) = f_i. \tag{8.83}$$

This momentum-tracking equation shows that the external force acts as a
source of momentum, and that for steady flow $f_i$ is equal to the divergence
of the momentum flux tensor:

$$f_i = \partial^l T_{li} = g^{kl}\partial_k T_{li}. \tag{8.84}$$

As we are interested in steady, irrotational motion with uniform density we
may use Bernoulli's theorem, $P + \frac{1}{2}\rho|v|^2 = \text{const.}$, to substitute $-\frac{1}{2}\rho|v|^2$ in
place of $P$. (The constant will not affect the momentum flux.) With this
substitution $T_{ij}$ becomes a traceless symmetric tensor:

$$T_{ij} = \rho\left(v_i v_j - \frac{1}{2} g_{ij}|v|^2\right). \tag{8.85}$$

Using $v_z = \frac{1}{2}(v_x - iv_y)$ and

$$T_{zz} = \frac{\partial x^i}{\partial z}\,\frac{\partial x^j}{\partial z}\, T_{ij}, \tag{8.86}$$

together with

$$x \equiv x^1 = \frac{1}{2}(z + \bar z), \qquad y \equiv x^2 = \frac{1}{2i}(z - \bar z), \tag{8.87}$$

we find

$$T \equiv T_{zz} = \frac{1}{4}(T_{xx} - T_{yy} - 2iT_{xy}) = \rho(v_z)^2. \tag{8.88}$$

This is the only component of $T_{ij}$ that we will need to consider. $T_{\bar z\bar z}$ is simply
$\bar T$, whereas $T_{z\bar z} = 0 = T_{\bar z z}$ because $T_{ij}$ is traceless.
In our complex co-ordinates, the equation

$$f_i = g^{kl}\partial_k T_{li} \tag{8.89}$$

reads

$$f_z = g^{\bar z z}\partial_{\bar z} T_{zz} + g^{z\bar z}\partial_z T_{\bar z z} = 2\partial_{\bar z} T. \tag{8.90}$$

We see that in steady flow the net momentum flux $\dot P_i$ out of a region $\Omega$ is
given by

$$\dot P_z = \int_\Omega f_z\, dx\,dy = \frac{1}{2i}\int_\Omega f_z\, d\bar z\,dz = \frac{1}{i}\int_\Omega \partial_{\bar z} T\, d\bar z\,dz = \frac{1}{i}\oint_{\partial\Omega} T\,dz. \tag{8.91}$$

We have used Stokes' theorem at the last step. In regions where there is no
external force, $T$ is analytic, $\partial_{\bar z} T = 0$, and the integral will be independent
of the choice of contour $\partial\Omega$. We can substitute $T = \rho v_z^2$ to get

$$\dot P_z = -i\rho\oint_{\partial\Omega} v_z^2\,dz. \tag{8.92}$$

To apply this result to our aerofoil we can take $\partial\Omega$ to be its boundary.
Then $\dot P_z$ is the total force exerted on the fluid by the wing, and, by Newton's
third law, this is minus the force exerted by the fluid on the wing. The total
force on the aerofoil is therefore

$$F_z = i\rho\oint_{\partial\Omega} v_z^2\,dz. \tag{8.93}$$

The result (8.93) is often called Blasius' theorem.

Evaluating the integral in (8.93) is not immediately possible because the
velocity $\mathbf{v}$ on the boundary will be a complicated function of the shape of
the body. We can, however, exploit the contour independence of the integral
and evaluate it over a path encircling the aerofoil at large distance where the
flow field takes the asymptotic form

$$v_z = U_z + \frac{\kappa}{4\pi i}\,\frac{1}{z} + O\left(\frac{1}{z^2}\right). \tag{8.94}$$

The $O(1/z^2)$ term is the velocity perturbation due to the air having to flow
round the wing, as with the cylinder in a free flow. To confirm that this flow
has the correct circulation we compute

$$\oint \mathbf{v}\cdot d\mathbf{r} = \oint v_z\,dz + \oint v_{\bar z}\,d\bar z = \kappa. \tag{8.95}$$

Substituting $v_z$ in (8.93) we find that the $O(1/z^2)$ term cannot contribute as
it cannot affect the residue of any pole. The only part that does contribute
is the cross term that arises from multiplying $U_z$ by $\kappa/(4\pi i z)$. This gives

$$F_z = i\rho\left(\frac{\kappa U_z}{2\pi i}\oint \frac{dz}{z}\right) = i\rho\kappa U_z, \tag{8.96}$$

so that

$$\frac{1}{2}(F_x - iF_y) = i\rho\kappa\,\frac{1}{2}(U_x - iU_y). \tag{8.97}$$

Thus, in conventional co-ordinates, the reaction force on the body is

$$F_x = \rho\kappa U_y, \qquad F_y = -\rho\kappa U_x. \tag{8.98}$$

The fluid therefore provides a lift force proportional to the product of the
circulation with the asymptotic velocity. The force is at right angles to the
incident airstream, so there is no drag.

The circulation around the wing is determined by the Kutta condition
that the velocity of the flow at the sharp trailing edge of the wing be finite.
If the wing starts moving into the air and the requisite circulation is not
yet established then the flow under the wing does not leave the trailing edge
smoothly but tries to whip round to the topside. The velocity gradients
become very large and viscous forces become important and prevent the air
from making the sharp turn. Instead, a starting vortex is shed from the
trailing edge. Kelvin's theorem on the conservation of vorticity shows that
this causes a circulation of equal and opposite strength to be induced about
the wing.

For finite wings, the path independence of $\oint \mathbf{v}\cdot d\mathbf{r}$ means that the wings
must leave a pair of trailing wingtip vortices of strength $\kappa$ that connect back
to the starting vortex to form a closed loop. The velocity field induced by the
trailing vortices causes the airstream incident on the aerofoil to come from a
slightly different direction than the asymptotic flow. Consequently, the lift is
not quite perpendicular to the motion of the wing. For finite-length wings,
therefore, lift comes at the expense of an inevitable induced drag force. The
work that has to be done against this drag force in driving the wing forwards
provides the kinetic energy in the trailing vortices.
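
The two-dimensional lift formula (8.98) can be checked by evaluating the Blasius integral (8.93) numerically. The Python sketch below is illustrative only: the helper `blasius_force` and the parameter values are assumptions made here, and the body is taken to be a circular cylinder of radius $a$ with circulation $\kappa$ in a stream of speed $U$ along $x$, for which $v_z = \frac{1}{2}U(1 - a^2/z^2) + \kappa/(4\pi i z)$ has the asymptotic form (8.94).

```python
import cmath
import math


def blasius_force(v_z, rho, radius, n_points=4000):
    """F_z = i rho times the contour integral of v_z^2 dz round |z| = radius."""
    total = 0j
    for m in range(n_points):
        z = radius * cmath.exp(2j * math.pi * m / n_points)
        dz = 1j * z * (2 * math.pi / n_points)
        total += v_z(z) ** 2 * dz
    return 1j * rho * total


rho, U, kappa, a = 1.2, 30.0, -5.0, 1.0   # hypothetical values
v_z = lambda z: 0.5 * U * (1 - a**2 / z**2) + kappa / (4j * math.pi * z)

F_z = blasius_force(v_z, rho, radius=3.0)
F_x, F_y = 2 * F_z.real, -2 * F_z.imag    # from F_z = (F_x - i F_y) / 2
print(F_x, F_y, -rho * kappa * U)
```

With the sign conventions of the text, an upward lift for a stream in the positive $x$ direction requires negative (clockwise) circulation, and the result does not depend on the radius of the integration contour.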



8.4 Applications of Cauchy's Theorem 

Cauchy's theorem provides the Royal Road to complex analysis. It is possible 
to develop the theory without it, but the path is harder going. 



8.4.1 Cauchy's Integral Formula 

If f(z) is analytic within and on the boundary of a simply connected domain
$\Omega$, with $\partial\Omega = \Gamma$, and if $\zeta$ is a point in $\Omega$, then, noting that the integrand
has a simple pole at $z = \zeta$ and applying the residue formula, we have Cauchy's
integral formula

$$f(\zeta) = \frac{1}{2\pi i}\oint_\Gamma \frac{f(z)}{z-\zeta}\,dz. \tag{8.99}$$

Figure 8.9: Cauchy contour.

This formula holds only if $\zeta$ lies within $\Omega$. If it lies outside, then the integrand
is analytic everywhere inside $\Omega$, and so the integral gives zero.

We may show that it is legitimate to differentiate under the integral sign 
in Cauchy's formula. If we do so n times, we have the useful corollary that 

$$f^{(n)}(\zeta) = \frac{n!}{2\pi i}\oint_\Gamma \frac{f(z)}{(z-\zeta)^{n+1}}\,dz. \tag{8.100}$$

This shows that being once differentiable (analytic) in a region automatically
implies that f(z) is differentiable arbitrarily many times!
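
Cauchy's formula (8.99) and its corollary (8.100) are easy to test numerically. The following is a minimal sketch, not part of the original notes: it assumes the contour is a circle, takes f(z) = exp z as an illustrative choice, and approximates the contour integral by the trapezoidal rule. The function name `contour_integral_derivative` and its parameters are hypothetical.

```python
import cmath
import math

def contour_integral_derivative(f, zeta, n=0, center=0j, radius=1.0, samples=4096):
    """Approximate n!/(2*pi*i) * closed integral of f(z)/(z - zeta)**(n+1) dz
    over the circle |z - center| = radius, using the trapezoidal rule."""
    total = 0j
    for k in range(samples):
        theta = 2.0 * math.pi * k / samples
        z = center + radius * cmath.exp(1j * theta)
        dz = 1j * (z - center) * (2.0 * math.pi / samples)  # dz = i (z - center) dtheta
        total += f(z) / (z - zeta) ** (n + 1) * dz
    return math.factorial(n) * total / (2j * math.pi)

zeta = 0.3 + 0.2j
value = contour_integral_derivative(cmath.exp, zeta)        # compare with exp(zeta)
third = contour_integral_derivative(cmath.exp, zeta, n=3)   # third derivative of exp is exp
outside = contour_integral_derivative(cmath.exp, 2.0 + 0j)  # zeta outside the unit circle
print(value, third, cmath.exp(zeta), outside)
```

For a point inside the circle the first two numbers should agree with exp(zeta) to rounding accuracy, while for a point outside the integral should be numerically zero, as stated after (8.99).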

Exercise 8.6: The generalized Cauchy formula. Suppose that we have solved a
D-bar problem (see exercise 8.3), and so found an $F(z,\bar z)$ with $\partial_{\bar z} F = f(z,\bar z)$
in a region $\Omega$. Compute the exterior derivative of

$$\frac{F(z,\bar z)}{z-\zeta}\,dz$$

using (8.56). Now, manipulating formally with delta functions, apply Stokes'
theorem to show that, for $(\zeta,\bar\zeta)$ in the interior of $\Omega$, we have

$$F(\zeta,\bar\zeta) = \frac{1}{2\pi i}\oint_{\partial\Omega} \frac{F(z,\bar z)}{z-\zeta}\,dz - \frac{1}{\pi}\int_\Omega \frac{f(z,\bar z)}{z-\zeta}\,dx\,dy.$$

This is called the generalized Cauchy formula. Note that the first term on the
right, unlike the second, is a function only of $\zeta$, and so is analytic.

Liouville's Theorem 

A dramatic corollary of Cauchy's integral formula is provided by

Liouville's theorem: If f(z) is analytic in all of C, and is bounded there,
meaning that there is a positive real number K such that |f(z)| < K, then
f(z) is a constant.

This result provides a powerful strategy for proving that two formulae,
$f_1(z)$ and $f_2(z)$, represent the same analytic function. If we can show that
the difference $f_1 - f_2$ is analytic and tends to zero at infinity then Liouville's
theorem tells us that $f_1 = f_2$.

Because the result is perhaps unintuitive, and because the methods are 
typical, we will spell out in detail how Liouville's theorem works. We select 
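any two points, $z_1$ and $z_2$, and use Cauchy's formula to write

$$f(z_1) - f(z_2) = \frac{1}{2\pi i}\oint_\Gamma \left(\frac{1}{z-z_1} - \frac{1}{z-z_2}\right) f(z)\,dz. \tag{8.101}$$

We take the contour $\Gamma$ to be a circle of radius $\rho$ centered on $z_1$. We make
$\rho > 2|z_1 - z_2|$, so that when z is on $\Gamma$ we are sure that $|z - z_2| > \rho/2$.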



Figure 8.10: Contour for Liouville's theorem.



Then, using $|\int f(z)\,dz| \le \int |f(z)|\,|dz|$, we have

$$|f(z_1) - f(z_2)| = \frac{1}{2\pi}\left|\oint_\Gamma \frac{(z_1 - z_2)}{(z-z_1)(z-z_2)}\,f(z)\,dz\right| \le \frac{1}{2\pi}\int_0^{2\pi} \frac{|z_1 - z_2|K}{\rho/2}\,d\theta = \frac{2|z_1 - z_2|K}{\rho}. \tag{8.102}$$

The right hand side can be made arbitrarily small by taking $\rho$ large enough,
so we must have $f(z_1) = f(z_2)$. As $z_1$ and $z_2$ were any pair of points, we
deduce that f(z) takes the same value everywhere.



8.4.2 Taylor and Laurent Series

We have defined a function to be analytic in a domain D if it is (once) 
complex differentiable at all points in D. It turned out that this apparently 
mild requirement automatically implied that the function is differentiable 
arbitrarily many times in D. In this section we shall see that knowledge 
of all derivatives of f(z) at any single point in D is enough to completely 
determine the function at any other point in D. Compare this with functions 
of a real variable, for which it is easy to construct examples that are once
but not twice differentiable, and where complete knowledge of the function at a
point, or even in a neighbourhood of a point, tells us absolutely nothing
of the behaviour of the function away from the point or neighbourhood.

The key ingredient in these almost magical properties of complex ana- 
lytic functions is that any analytic function has a Taylor series expansion 
that actually converges to the function. Indeed an alternative definition of 
analyticity is that f(z) be representable by a convergent power series. For 
real variables this is the definition of a real analytic function. 

To appreciate the utility of power series representations we do need to 
discuss some basic properties of power series. Most of these results are ex- 
tensions to the complex plane of what we hope are familiar notions from real 
analysis. 

Consider the power series

$$\sum_{n=0}^{\infty} a_n (z - z_0)^n = \lim_{N\to\infty} S_N, \tag{8.103}$$

where $S_N$ are the partial sums

$$S_N = \sum_{n=0}^{N} a_n (z - z_0)^n. \tag{8.104}$$

Suppose that this limit exists (i.e. the series is convergent) for some $z = \zeta$;
then it turns out that the series is absolutely convergent⁵ for any $|z - z_0| < |\zeta - z_0|$.

⁵ Recall that absolute convergence of $\sum a_n$ means that $\sum |a_n|$ converges. Absolute
convergence implies convergence, and also allows us to rearrange the order of terms in the
series without changing the value of the sum. Compare this with conditional convergence,
where $\sum a_n$ converges, but $\sum |a_n|$ does not. You may remember that Riemann showed
that the terms of a conditionally convergent series can be rearranged so as to get any
answer whatsoever!






To establish this absolute convergence we may assume, without loss of
generality, that $z_0 = 0$. Then, convergence of the sum $\sum a_n \zeta^n$ requires that
$|a_n \zeta^n| \to 0$, and thus $|a_n \zeta^n|$ is bounded. In other words, there is a B such
that $|a_n \zeta^n| < B$ for any n. We now write

$$|a_n z^n| = |a_n \zeta^n| \left|\frac{z}{\zeta}\right|^n \le B \left|\frac{z}{\zeta}\right|^n. \tag{8.105}$$

The sum $\sum |a_n z^n|$ therefore converges for $|z/\zeta| < 1$, by comparison with a
geometric progression.

This result, that if a power series in $(z - z_0)$ converges at a point then
it converges at all points closer to $z_0$, shows that a power series possesses
some radius of convergence R. The series converges for all $|z - z_0| < R$, and
diverges for all $|z - z_0| > R$. (What happens on the circle $|z - z_0| = R$ is
usually delicate, and harder to establish.) We soon show that the radius of
convergence of a power series is the distance from $z_0$ to the nearest singularity
of the function that it represents.

By comparison with a geometric progression, we may establish the
following useful formulae giving R for the series $\sum a_n z^n$:

$$R = \lim_{n\to\infty} \frac{|a_{n-1}|}{|a_n|} = \lim_{n\to\infty} |a_n|^{-1/n}. \tag{8.106}$$

The proof of these formulae is identical to the real-variable version.
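
As a quick illustration of the ratio and root formulae for R, here is a small sketch. The coefficients $a_n = n/3^n$ are an assumed example (not taken from the text) for which R = 3, and the helper names are hypothetical. Logarithms are used so that large n does not cause floating-point underflow.

```python
import math

def log_abs_coefficient(n):
    """log|a_n| for the illustrative choice a_n = n / 3**n (assumed example, n >= 1)."""
    return math.log(n) - n * math.log(3.0)

def ratio_estimate(n):
    """|a_{n-1}| / |a_n| for n >= 2, computed via logarithms."""
    return math.exp(log_abs_coefficient(n - 1) - log_abs_coefficient(n))

def root_estimate(n):
    """|a_n|**(-1/n), again via logarithms."""
    return math.exp(-log_abs_coefficient(n) / n)

for n in (10, 100, 1000, 10000):
    print(n, ratio_estimate(n), root_estimate(n))
```

Both columns approach 3 as n grows. For this example the root estimate converges more slowly than the ratio estimate, which is typical when the coefficients carry a power-law prefactor.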

When we differentiate the terms in a power series, and thus take
$a_n z^n \to n a_n z^{n-1}$, this does not alter R. This observation suggests that it is legitimate
to evaluate the derivative of the function represented by the power series by
differentiating term-by-term. As a step on the way to justifying this, observe
that if the series converges at $z = \zeta$ and $D_r$ is the domain $|z| < r < |\zeta|$ then,
using the same bound as in the proof of absolute convergence, we have

$$|a_n z^n| \le B\,\frac{|z|^n}{|\zeta|^n} \le B\,\frac{r^n}{|\zeta|^n} = M_n, \tag{8.107}$$

where $\sum M_n$ is convergent. As a consequence $\sum a_n z^n$ is uniformly
convergent in $D_r$ by the Weierstrass "M" test. You probably know that
uniform convergence allows the interchange of the order of sums and integrals:
$\int \left(\sum f_n(x)\right) dx = \sum \int f_n(x)\,dx$. For real variables, uniform convergence is
not a strong enough condition for us to safely interchange the order of sums
and derivatives: $\left(\sum f_n(x)\right)'$ is not necessarily equal to $\sum f_n'(x)$. For complex
analytic functions, however, Cauchy's integral formula reduces the operation
of differentiation to that of integration, and so this interchange is permitted.
In particular we have that if

$$f(z) = \sum_{n=0}^{\infty} a_n z^n, \tag{8.108}$$

and R is defined by $R = |\zeta|$ for any $\zeta$ for which the series converges, then
f(z) is analytic in |z| < R and

$$f'(z) = \sum_{n=0}^{\infty} n a_n z^{n-1} \tag{8.109}$$

is also analytic in |z| < R.

Morera's Theorem

There is a partial converse of Cauchy's theorem:

Theorem (Morera): If f(z) is defined and continuous in a domain D, and
if $\oint_\Gamma f(z)\,dz = 0$ for all closed contours, then f(z) is analytic in D.

To prove this we set $F(z) = \int_P^z f(\zeta)\,d\zeta$. The integral is path-independent by the
hypothesis of the theorem, and because f(z) is continuous we can differentiate
with respect to the integration limit to find that F'(z) = f(z). Thus F(z)
is complex differentiable, and so analytic. Then, by Cauchy's formula for
higher derivatives, F''(z) = f'(z) exists, and so f(z) itself is analytic.

A corollary of Morera's theorem is that if $f_n(z) \to f(z)$ uniformly in D,
with all the $f_n$ analytic, then

i) f(z) is analytic in D, and

ii) $f_n'(z) \to f'(z)$ uniformly.

We use Morera's theorem to prove (i) (appealing to the uniform
convergence to justify the interchange of the order of summation and integration),
and use Cauchy's theorem to prove (ii).

Taylor's Theorem for analytic functions 

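Theorem: Let $\Gamma$ be a circle of radius $\rho$ centered on the point a. Suppose that
f(z) is analytic within and on $\Gamma$, and that the point $z = \zeta$ is within $\Gamma$.
Then $f(\zeta)$ can be expanded as a Taylor series

$$f(\zeta) = f(a) + \sum_{n=1}^{\infty} \frac{(\zeta - a)^n}{n!} f^{(n)}(a), \tag{8.110}$$

meaning that this series converges to $f(\zeta)$ for all $\zeta$ such that $|\zeta - a| < \rho$.
To prove this theorem we use the identity

$$\frac{1}{z-\zeta} = \frac{1}{z-a} + \frac{(\zeta-a)}{(z-a)^2} + \cdots + \frac{(\zeta-a)^{N-1}}{(z-a)^N} + \frac{(\zeta-a)^N}{(z-a)^N}\,\frac{1}{z-\zeta} \tag{8.111}$$

and Cauchy's integral, to write

$$\begin{aligned}
f(\zeta) &= \frac{1}{2\pi i}\oint_\Gamma \frac{f(z)}{z-\zeta}\,dz \\
&= \sum_{n=0}^{N-1} \frac{(\zeta-a)^n}{2\pi i}\oint_\Gamma \frac{f(z)}{(z-a)^{n+1}}\,dz + \frac{(\zeta-a)^N}{2\pi i}\oint_\Gamma \frac{f(z)}{(z-a)^N (z-\zeta)}\,dz \\
&= \sum_{n=0}^{N-1} \frac{(\zeta-a)^n}{n!} f^{(n)}(a) + R_N,
\end{aligned} \tag{8.112}$$

where

$$R_N \stackrel{\mathrm{def}}{=} \frac{(\zeta-a)^N}{2\pi i}\oint_\Gamma \frac{f(z)}{(z-a)^N (z-\zeta)}\,dz. \tag{8.113}$$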

This is Taylor's theorem with remainder. For real variables this is as far as
we can go. Even if a real function is differentiable infinitely many times,
there is no reason for the remainder to become small. For analytic functions,
however, we can show that $R_N \to 0$ as $N \to \infty$. This means that the
complex-variable Taylor series is convergent, and its limit is actually equal
to $f(\zeta)$. To show that $R_N \to 0$, recall that $\Gamma$ is a circle of radius $\rho$ centered
on z = a. Let $r = |\zeta - a| < \rho$, and let M be an upper bound for |f(z)| on $\Gamma$.
(This exists because f is continuous and $\Gamma$ is a compact subset of C.) Then,
estimating the integral using methods similar to those invoked in our proof
of Liouville's Theorem, we find that

$$|R_N| \le \frac{r^N}{2\pi}\left(\frac{2\pi\rho M}{\rho^N(\rho - r)}\right). \tag{8.114}$$

As $r < \rho$, this tends to zero as $N \to \infty$.

We can take $\rho$ as large as we like provided that no singularities of
f end up within, or on, the circle. This confirms the claim made earlier:
the radius of convergence of the power series representation of an analytic
function is the distance to the nearest singularity.
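
The claim that the radius of convergence is the distance to the nearest singularity can be seen numerically. The sketch below is an illustration under stated assumptions, not the author's construction: it takes f(z) = 1/(1 + z²), which is perfectly smooth on the real axis but has poles at z = ±i, and expands about a = 1, so the expected radius is √2. The Taylor coefficients are computed from the contour-integral expression appearing in (8.112), using a trapezoidal rule on a circle of radius 1.2. All function and variable names are hypothetical.

```python
import cmath
import math

def f(z):
    return 1.0 / (1.0 + z * z)   # poles at z = +i and z = -i

def taylor_coefficients(func, a, count, radius, samples=4096):
    """a_n = (1/2 pi i) * contour integral of func(z)/(z-a)**(n+1) dz on |z-a| = radius."""
    values = []
    for k in range(samples):
        w = radius * cmath.exp(2j * math.pi * k / samples)   # w = z - a
        values.append((w, func(a + w)))
    # dz / (2 pi i w) = dtheta / (2 pi), so each coefficient is an average over the circle
    return [sum(fz * w ** (-n) for w, fz in values) / samples for n in range(count)]

a = 1.0
coeffs = taylor_coefficients(f, a, count=60, radius=1.2)   # 1.2 < sqrt(2)

def partial_sum(x):
    return sum(c * (x - a) ** n for n, c in enumerate(coeffs)).real

inside = partial_sum(2.0)   # |2.0 - 1| = 1.0 < sqrt(2): compare with f(2.0) = 0.2
# |2.5 - 1| = 1.5 > sqrt(2): the individual terms fail to shrink
largest_outside_term = max(abs(c) * 1.5 ** n for n, c in enumerate(coeffs))
print(inside, f(2.0), largest_outside_term)
```

At x = 2 the sixty-term partial sum reproduces f(2), whereas at x = 2.5 the terms themselves grow, so the series cannot converge there even though f is finite and smooth at that point.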

Laurent Series 

Theorem (Laurent): Let $\Gamma_1$ and $\Gamma_2$ be two anticlockwise circular paths with
centre a, radii $\rho_1$ and $\rho_2$, and with $\rho_2 < \rho_1$. If f(z) is analytic on the circles
and within the annulus between them, then, for $\zeta$ in the annulus:

$$f(\zeta) = \sum_{n=0}^{\infty} a_n (\zeta - a)^n + \sum_{n=1}^{\infty} b_n (\zeta - a)^{-n}. \tag{8.115}$$

Figure 8.11: Contours for Laurent's theorem.

The coefficients $a_n$ and $b_n$ are given by

$$a_n = \frac{1}{2\pi i}\oint_{\Gamma_1} \frac{f(z)}{(z-a)^{n+1}}\,dz, \qquad b_n = \frac{1}{2\pi i}\oint_{\Gamma_2} f(z)(z-a)^{n-1}\,dz. \tag{8.116}$$

Laurent's theorem is proved by observing that

$$f(\zeta) = \frac{1}{2\pi i}\oint_{\Gamma_1} \frac{f(z)}{z-\zeta}\,dz - \frac{1}{2\pi i}\oint_{\Gamma_2} \frac{f(z)}{z-\zeta}\,dz, \tag{8.117}$$

and using the identities

$$\frac{1}{z-\zeta} = \frac{1}{z-a} + \frac{(\zeta-a)}{(z-a)^2} + \cdots + \frac{(\zeta-a)^{N-1}}{(z-a)^N} + \frac{(\zeta-a)^N}{(z-a)^N}\,\frac{1}{z-\zeta}, \tag{8.118}$$

and

$$-\frac{1}{z-\zeta} = \frac{1}{\zeta-a} + \frac{(z-a)}{(\zeta-a)^2} + \cdots + \frac{(z-a)^{N-1}}{(\zeta-a)^N} + \frac{(z-a)^N}{(\zeta-a)^N}\,\frac{1}{\zeta-z}. \tag{8.119}$$

Once again we can show that the remainder terms tend to zero.

Warning: Although the coefficients $a_n$ are given by the same integrals as in
Taylor's theorem, they are not interpretable as derivatives of f unless f(z)
is analytic within the inner circle, in which case all the $b_n$ are zero.
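
The coefficient integrals of (8.116) can be evaluated numerically, and doing so shows that the same function has different Laurent expansions in different annuli. The sketch below is an illustration with assumed ingredients: the example f(z) = 1/(z(1 − z)), the circle radii 0.5 and 2, and the helper name `laurent_coefficient` are all choices made here, not taken from the text. A single routine returns the coefficient of $(z-a)^k$, so k ≥ 0 gives $a_k$ and k = −n gives $b_n$.

```python
import cmath
import math

def laurent_coefficient(f, a, radius, k, samples=2048):
    """Coefficient of (z - a)**k in the Laurent expansion of f valid on |z - a| = radius."""
    total = 0j
    for j in range(samples):
        w = radius * cmath.exp(2j * math.pi * j / samples)   # w = z - a
        total += f(a + w) * w ** (-k)                        # dz / (2 pi i w) = dtheta / (2 pi)
    return total / samples

def f(z):
    return 1.0 / (z * (1.0 - z))   # singular at z = 0 and z = 1

# Annulus 0 < |z| < 1: f = 1/z + 1 + z + z**2 + ...
inner = {k: laurent_coefficient(f, 0.0, 0.5, k) for k in (-2, -1, 0, 3)}
# Annulus |z| > 1: f = -(1/z**2 + 1/z**3 + ...)
outer = {k: laurent_coefficient(f, 0.0, 2.0, k) for k in (-5, -2, -1, 0)}
print(inner)
print(outer)
```

In the inner annulus the coefficient of 1/z is 1, while in the outer annulus it vanishes and every coefficient of a non-negative power is zero. This is consistent with the warning above: the $a_n$ are derivatives of f only when f is analytic inside the inner circle.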



8.4.3 Zeros and Singularities 

This section is something of a nosology — a classification of diseases — but 
you should study it carefully as there is some tight reasoning here, and the 
conclusions are the essential foundations for the rest of the subject.
First a review and some definitions:

a) If f(z) is analytic within a domain D, we have seen that f may be
expanded in a Taylor series about any point $z_0 \in D$:

$$f(z) = \sum_{n=0}^{\infty} a_n (z - z_0)^n. \tag{8.120}$$

If $a_0 = a_1 = \cdots = a_{n-1} = 0$, and $a_n \neq 0$, so that the first non-zero
term in the series is $a_n (z - z_0)^n$, we say that f(z) has a zero of order n
at $z_0$.

b) A singularity of f(z) is a point at which f(z) ceases to be differentiable.
If f(z) has no singularities at finite z (for example, f(z) = sin z) then 
it is said to be an entire function. 

c) If f(z) is analytic in D except at z = a, an isolated singularity, then
we may draw two concentric circles of centre a, both within D, and in
the annulus between them we have the Laurent expansion

$$f(z) = \sum_{n=0}^{\infty} a_n (z-a)^n + \sum_{n=1}^{\infty} b_n (z-a)^{-n}. \tag{8.121}$$

The second term, consisting of negative powers, is called the principal
part of f(z) at z = a. It may happen that $b_m \neq 0$ but $b_n = 0$, n > m.
Such a singularity is called a pole of order m at z = a. The coefficient
$b_1$, which may be 0, is called the residue of f at the pole z = a. If the
series of negative powers does not terminate, the singularity is called
an isolated essential singularity.

Now some observations:

i) Suppose f(z) is analytic in a domain D containing the point z = a.
Then we can expand: $f(z) = \sum a_n (z-a)^n$. If f(z) is zero at z = a,
then there are exactly two possibilities: a) all the $a_n$ vanish, and then
f(z) is identically zero; b) there is a first non-zero coefficient, $a_m$ say,
and so $f(z) = (z-a)^m \varphi(z)$, where $\varphi(a) \neq 0$. In the second case f is said to
possess a zero of order m at z = a.

ii) If z = a is a zero of order m of f(z) then the zero is isolated, i.e.
there is a neighbourhood of a which contains no other zero. To see this
observe that $f(z) = (z-a)^m \varphi(z)$ where $\varphi(z)$ is analytic and $\varphi(a) \neq 0$.
Analyticity implies continuity, and by continuity there is a
neighbourhood of a in which $\varphi(z)$ does not vanish.

iii) Limit points of zeros I: Suppose that we know that f(z) is analytic in D
and we know that it vanishes at a sequence of points $a_1, a_2, a_3, \ldots \in D$.
If these points have a limit point⁶ that is interior to D then f(z) must,
by continuity, be zero there. But this would be a non-isolated zero, in
contradiction to item ii), unless f(z) actually vanishes identically in D.
This, then, is the only option.

iv) From the definition of poles, they too are isolated.

v) If f(z) has a pole at z = a then $f(z) \to \infty$ as $z \to a$ in any manner.

vi) Limit points of zeros II: Suppose we know that f is analytic in D,
except possibly at z = a which is a limit point of zeros as in iii), but we
also know that f is not identically zero. Then z = a must be a singularity
of f, but not a pole (because f would tend to infinity and could
not have arbitrarily close zeros), so a must be an isolated essential
singularity. For example sin(1/z) has an isolated essential singularity at
z = 0, this being a limit point of the zeros at z = 1/(nπ).

vii) A limit point of poles or other singularities would be a non-isolated
essential singularity.

8.4.4 Analytic Continuation 

Suppose that $f_1(z)$ is analytic in the (open, arcwise-connected) domain $D_1$,
and $f_2(z)$ is analytic in $D_2$, with $D_1 \cap D_2 \neq \emptyset$. Suppose further that
$f_1(z) = f_2(z)$ in $D_1 \cap D_2$. Then we say that $f_2$ is an analytic continuation of $f_1$ to
$D_2$. Such analytic continuations are unique: if $f_3$ is also analytic in $D_2$, and
$f_3 = f_1$ in $D_1 \cap D_2$, then $f_2 - f_3 = 0$ in $D_1 \cap D_2$. Because the intersection
of two open sets is also open, $f_2 - f_3$ vanishes on an open set and so, by
observation iii) of the previous section, it vanishes everywhere in $D_2$.

⁶ A point $z_0$ is a limit point of a set S if for every $\epsilon > 0$ there is some $a \in S$, other than
$z_0$ itself, such that $|a - z_0| < \epsilon$. A sequence need not have a limit for it to possess one or
more limit points.




Figure 8.12: Intersecting domains. 

We can use this uniqueness result, coupled with the circular domains of 
convergence of the Taylor series, to extend the definition of analytic functions 
beyond the domain of their initial definition. 



The distribution $x_+^{\alpha-1}$

An interesting and useful example of analytic continuation is provided by the
distribution $x_+^{\alpha-1}$, which, for real positive $\alpha$, is defined by its evaluation on
a test function $\varphi(x)$ as

$$(x_+^{\alpha-1}, \varphi) = \int_0^\infty x^{\alpha-1}\varphi(x)\,dx. \tag{8.122}$$







The pairing $(x_+^{\alpha-1}, \varphi)$ extends to a complex analytic function of $\alpha$ provided
the integral converges. Test functions are required to decrease at infinity
faster than any power of x, and so the integral always converges at the upper
limit. It will converge at the lower limit provided $\mathrm{Re}\,(\alpha) > 0$. Assume that
this is so, and integrate by parts using

$$\frac{d}{dx}\left(\frac{x^\alpha}{\alpha}\varphi(x)\right) = x^{\alpha-1}\varphi(x) + \frac{x^\alpha}{\alpha}\varphi'(x). \tag{8.123}$$



We find that, for $\epsilon > 0$,

$$\left[\frac{x^\alpha}{\alpha}\varphi(x)\right]_\epsilon^\infty = \int_\epsilon^\infty x^{\alpha-1}\varphi(x)\,dx + \int_\epsilon^\infty \frac{x^\alpha}{\alpha}\varphi'(x)\,dx. \tag{8.124}$$

The integrated-out part on the left-hand-side of (8.124) tends to zero as
we take $\epsilon$ to zero, and both of the integrals converge in this limit as well.
Consequently

$$I_1(\alpha) \equiv -\frac{1}{\alpha}\int_0^\infty x^\alpha \varphi'(x)\,dx \tag{8.125}$$

is equal to $(x_+^{\alpha-1}, \varphi)$ for $0 < \mathrm{Re}\,(\alpha) < \infty$. However, the integral defining
$I_1(\alpha)$ converges in the larger region $-1 < \mathrm{Re}\,(\alpha) < \infty$. It therefore provides
an analytic continuation to this larger domain. The factor of $1/\alpha$ reveals that
the analytically-continued function possesses a pole at $\alpha = 0$, with residue

$$-\int_0^\infty \varphi'(x)\,dx = \varphi(0). \tag{8.126}$$

We can repeat the integration by parts, and find that

$$I_2(\alpha) \equiv \frac{1}{\alpha(\alpha+1)}\int_0^\infty x^{\alpha+1}\varphi''(x)\,dx \tag{8.127}$$

provides an analytic continuation to the region $-2 < \mathrm{Re}\,(\alpha) < \infty$. By
proceeding in this manner, we can continue $(x_+^{\alpha-1}, \varphi)$ to a function analytic
in the entire complex $\alpha$ plane with the exception of zero and the negative
integers, at which it has simple poles. The residue of the pole at $\alpha = -n$ is
$\varphi^{(n)}(0)/n!$.

There is another, much more revealing, way of expressing these analytic
continuations. To obtain this, suppose that $\phi \in C^\infty[0,\infty]$ and $\phi \to 0$ at
infinity at least as fast as 1/x. (Our test function $\varphi$ decreases much more
rapidly than this, but 1/x is all we need for what follows.) Now the function

$$I(\alpha) \equiv \int_0^\infty x^{\alpha-1}\phi(x)\,dx \tag{8.128}$$

is convergent and analytic in the strip $0 < \mathrm{Re}\,(\alpha) < 1$. By the same reasoning
as above, $I(\alpha)$ is there equal to

$$-\int_0^\infty \frac{x^\alpha}{\alpha}\phi'(x)\,dx. \tag{8.129}$$
a 



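Again this new integral provides an analytic continuation to the larger strip
$-1 < \mathrm{Re}\,(\alpha) < 1$. But in the left-hand half of this strip, where
$-1 < \mathrm{Re}\,(\alpha) < 0$, we can write

$$\begin{aligned}
-\int_0^\infty \frac{x^\alpha}{\alpha}\phi'(x)\,dx
&= \lim_{\epsilon\to 0}\left\{-\int_\epsilon^\infty \frac{x^\alpha}{\alpha}\phi'(x)\,dx\right\} \\
&= \lim_{\epsilon\to 0}\left\{\int_\epsilon^\infty x^{\alpha-1}\phi(x)\,dx + \phi(\epsilon)\frac{\epsilon^\alpha}{\alpha}\right\} \\
&= \lim_{\epsilon\to 0}\left\{\int_\epsilon^\infty x^{\alpha-1}[\phi(x) - \phi(\epsilon)]\,dx\right\} \\
&= \int_0^\infty x^{\alpha-1}[\phi(x) - \phi(0)]\,dx.
\end{aligned} \tag{8.130}$$

Observe how the integrated out part, which tends to zero in $0 < \mathrm{Re}\,(\alpha) < 1$,
becomes divergent in the strip $-1 < \mathrm{Re}\,(\alpha) < 0$. This divergence is there
craftily combined with the integral to cancel its divergence, leaving a finite
remainder. As a consequence, for $-1 < \mathrm{Re}\,(\alpha) < 0$, the analytic continuation
is given by

$$I(\alpha) = \int_0^\infty x^{\alpha-1}[\phi(x) - \phi(0)]\,dx. \tag{8.131}$$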



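Next we observe that $\chi(x) = [\phi(x) - \phi(0)]/x$ tends to zero as 1/x for
large x, and at x = 0 can be defined by its limit as $\chi(0) = \phi'(0)$. This $\chi(x)$
then satisfies the same hypotheses as $\phi(x)$. With $I(\alpha)$ denoting the analytic
continuation of the original I, we therefore have

$$\begin{aligned}
I(\alpha) &= \int_0^\infty x^{\alpha-1}[\phi(x) - \phi(0)]\,dx, && -1 < \mathrm{Re}\,(\alpha) < 0 \\
&= \int_0^\infty x^{\beta-1}\left[\frac{\phi(x) - \phi(0)}{x}\right] dx, && \text{where } \beta = \alpha + 1, \\
&\to \int_0^\infty x^{\beta-1}\left[\frac{\phi(x) - \phi(0)}{x} - \phi'(0)\right] dx, && -1 < \mathrm{Re}\,(\beta) < 0 \\
&= \int_0^\infty x^{\alpha-1}[\phi(x) - \phi(0) - x\phi'(0)]\,dx, && -2 < \mathrm{Re}\,(\alpha) < -1,
\end{aligned} \tag{8.132}$$

the arrow denoting the same analytic continuation process that we used with
$\phi$.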

We can now apply this machinery to our original $\varphi(x)$, and so deduce
that the analytically-continued distribution is given by

$$(x_+^{\alpha-1}, \varphi) = \begin{cases}
\displaystyle\int_0^\infty x^{\alpha-1}\varphi(x)\,dx, & 0 < \mathrm{Re}\,(\alpha) < \infty, \\
\displaystyle\int_0^\infty x^{\alpha-1}[\varphi(x) - \varphi(0)]\,dx, & -1 < \mathrm{Re}\,(\alpha) < 0, \\
\displaystyle\int_0^\infty x^{\alpha-1}[\varphi(x) - \varphi(0) - x\varphi'(0)]\,dx, & -2 < \mathrm{Re}\,(\alpha) < -1,
\end{cases} \tag{8.133}$$

and so on. The analytic continuation automatically subtracts more and more
terms of the Taylor series of $\varphi(x)$ the deeper we penetrate into the left-hand
half-plane. This property, that analytic continuation covertly subtracts the
minimal number of Taylor-series terms required to ensure convergence, lies
behind a number of physics applications, most notably the method of
dimensional regularization in quantum field theory.
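
The subtraction formulae (8.133) can be checked against a case where the analytic continuation is known in closed form. The sketch below assumes the illustrative test function exp(−x), which is our choice rather than the text's. For Re α > 0 the pairing is then Euler's Gamma function, so the subtracted integrals should reproduce Γ(α) in the strips to the left. The function names, the substitution x = exp(u), and the integration limits are assumptions of this sketch.

```python
import math

def subtracted_exponential(x, terms):
    """exp(-x) minus the first `terms` terms of its Taylor series about x = 0."""
    if x < 0.1:
        # Sum the remaining series directly; subtracting would lose precision here.
        return sum((-x) ** k / math.factorial(k) for k in range(terms, terms + 20))
    return math.exp(-x) - sum((-x) ** k / math.factorial(k) for k in range(terms))

def continued_pairing(alpha, terms, u_max=60.0, steps=6000):
    """Trapezoidal estimate of the integral of x**(alpha-1) * subtracted_exponential(x, terms)
    over 0 < x < infinity, after the substitution x = exp(u)."""
    h = 2.0 * u_max / steps
    total = 0.0
    for j in range(steps + 1):
        x = math.exp(-u_max + j * h)
        weight = 0.5 if j in (0, steps) else 1.0
        # x**(alpha-1) dx = x**alpha du
        total += weight * x ** alpha * subtracted_exponential(x, terms) * h
    return total

# (alpha, number of subtracted Taylor terms) for the three strips of (8.133)
for alpha, terms in ((0.5, 0), (-0.5, 1), (-1.5, 2)):
    print(alpha, continued_pairing(alpha, terms), math.gamma(alpha))
```

Each row should show the numerically integrated, Taylor-subtracted pairing agreeing with `math.gamma(alpha)`, including at negative α where the unsubtracted integral diverges.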

The following exercise illustrates some standard techniques of reasoning 
via analytic continuation. 

Exercise 8.7: Define the dilogarithm function by the series

$$\mathrm{Li}_2(z) = \frac{z}{1^2} + \frac{z^2}{2^2} + \frac{z^3}{3^2} + \cdots.$$

The radius of convergence of this series is unity, but the domain of $\mathrm{Li}_2(z)$ can
be extended to |z| > 1 by analytic continuation.

a) Observe that the series converges at z = ±1, and at z = 1 is

$$\mathrm{Li}_2(1) = 1 + \frac{1}{2^2} + \frac{1}{3^2} + \cdots = \frac{\pi^2}{6}.$$

Rearrange the series to show that

$$\mathrm{Li}_2(-1) = -\frac{\pi^2}{12}.$$

b) Identify the derivative of the power series for $\mathrm{Li}_2(z)$ with that of an
elementary function. Exploit your identification to extend the definition
of $[\mathrm{Li}_2(z)]'$ outside |z| < 1. Use the properties of this derivative function,
together with part a), to prove that

$$\mathrm{Li}_2(-z) + \mathrm{Li}_2\left(-\frac{1}{z}\right) = -\frac{1}{2}(\ln z)^2 - \frac{\pi^2}{6}.$$

This formula allows us to calculate values of the dilogarithm for |z| > 1
in terms of those with |z| < 1.

Many weird identities involving dilogarithms exist. Some, such as

$$\mathrm{Li}_2\left(-\frac{1}{2}\right) + \frac{1}{6}\mathrm{Li}_2\left(\frac{1}{9}\right) = -\frac{\pi^2}{18} + \ln 2\,\ln 3 - \frac{1}{2}(\ln 2)^2 - \frac{1}{3}(\ln 3)^2,$$

were found by Ramanujan. Others, originally discovered by sophisticated
numerical methods, have been given proofs based on techniques from quantum
mechanics. Polylogarithms, defined by

$$\mathrm{Li}_k(z) = \frac{z}{1^k} + \frac{z^2}{2^k} + \frac{z^3}{3^k} + \cdots,$$

occur frequently when evaluating Feynman diagrams.



8.4.5 Removable Singularities and the Weierstrass-Casorati Theorem

Sometimes we are given a definition that makes a function analytic in a 
region with the exception of a single point. Can we extend the definition to 
make the function analytic in the entire region? Provided that the function 
is well enough behaved near the point, the answer is yes, and the extension 
is unique. Curiously, the proof that this is so gives us insight into the wild 
behaviour of functions near essential singularities. 



Removable singularities 

Suppose that f(z) is analytic in $D \setminus a$, but that $\lim_{z\to a}(z-a)f(z) = 0$; then f
may be extended to a function analytic in all of D, i.e. z = a is a removable
singularity. To see this, let $\zeta$ lie between two simple closed contours $\Gamma_1$ and
$\Gamma_2$, with a within the smaller, $\Gamma_2$. We use Cauchy to write

$$f(\zeta) = \frac{1}{2\pi i}\oint_{\Gamma_1} \frac{f(z)}{z-\zeta}\,dz - \frac{1}{2\pi i}\oint_{\Gamma_2} \frac{f(z)}{z-\zeta}\,dz. \tag{8.134}$$

Now we can shrink $\Gamma_2$ down to be very close to a, and because of the condition
on f(z) near z = a, we see that the second integral vanishes. We can also
arrange for $\Gamma_1$ to enclose any chosen point in D. Thus, if we set

$$\tilde f(\zeta) = \frac{1}{2\pi i}\oint_{\Gamma_1} \frac{f(z)}{z-\zeta}\,dz \tag{8.135}$$

within $\Gamma_1$, we see that $\tilde f = f$ in $D \setminus a$, and is analytic in all of D. The extension
is unique because any two analytic functions that agree everywhere except
for a single point, must also agree at that point.

Weierstrass-Casorati

We apply the idea of removable singularities to show just how pathological
a beast is an isolated essential singularity:

Theorem (Weierstrass-Casorati): Let z = a be an isolated essential singularity
of f(z), then in any neighbourhood of a the function f(z) comes arbitrarily
close to any assigned value in C.

To prove this, define $N_\delta(a) = \{z \in \mathbb{C} : |z - a| < \delta\}$, and
$N_\epsilon(\zeta) = \{z \in \mathbb{C} : |z - \zeta| < \epsilon\}$. The claim is then that there is a $z \in N_\delta(a)$ such that
$f(z) \in N_\epsilon(\zeta)$. Suppose that the claim is not true. Then we have $|f(z) - \zeta| > \epsilon$
for all $z \in N_\delta(a)$. Therefore

$$\left|\frac{1}{f(z) - \zeta}\right| < \frac{1}{\epsilon} \tag{8.136}$$

in $N_\delta(a)$, while $1/(f(z) - \zeta)$ is analytic in $N_\delta(a) \setminus a$. Therefore z = a is a
removable singularity of $1/(f(z) - \zeta)$, and there is an analytic g(z) which
coincides with $1/(f(z) - \zeta)$ at all points except a. Therefore

$$f(z) = \zeta + \frac{1}{g(z)} \tag{8.137}$$

except at a. Now g(z), being analytic, may have a zero at z = a giving a
pole in f, but it cannot give rise to an essential singularity. The claim is
true, therefore.
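
To make the theorem concrete, consider the function exp(1/z), which has an isolated essential singularity at z = 0. This example and the helper below are illustrative assumptions of this sketch. For any non-zero target value w, the equation exp(1/z) = w has the explicit solutions 1/z = Log w + 2πik for integer k, and these accumulate at the origin, so the target is not merely approached but attained in every neighbourhood of z = 0.

```python
import cmath
import math

def preimages_near_zero(w, count=5, start=1):
    """Solutions z_k of exp(1/z) = w (w != 0), from 1/z_k = Log(w) + 2*pi*i*k."""
    log_w = cmath.log(w)
    return [1.0 / (log_w + 2j * math.pi * k) for k in range(start, start + count)]

target = -3.0 + 4.0j                      # an arbitrary non-zero value
points = preimages_near_zero(target, count=5, start=1000)
for z in points:
    print(abs(z), cmath.exp(1.0 / z))     # tiny |z|, yet exp(1/z) equals the target
```

Increasing `start` pushes the solutions as close to the origin as desired. The single value w = 0 is never attained, since the exponential has no zeros.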



Picard's Theorems 

Weierstrass-Casorati is elementary. There are much stronger results:

Theorem (Picard's little theorem): Every nonconstant entire function attains
every complex value with at most one exception.

Theorem (Picard's big theorem): In any neighbourhood of an isolated essential
singularity, f(z) takes every complex value with at most one exception.

The proofs of these theorems are hard.

As an illustration of Picard's little theorem, observe that the function
exp z is entire, and takes all values except 0. For the big theorem observe
that the function f(z) = exp(1/z) has an essential singularity at z = 0, and
takes all values, with the exception of 0, in any neighbourhood of z = 0.



8.5 Meromorphic functions and the Winding-Number

A function whose only singularities in D are poles is said to be meromor- 
phic there. These functions have a number of properties that are essentially 
topological in character. 

8.5.1 Principle of the Argument 

If f(z) is meromorphic in D with $\partial D = \Gamma$, and $f(z) \neq 0$ on $\Gamma$, then

$$\frac{1}{2\pi i}\oint_\Gamma \frac{f'(z)}{f(z)}\,dz = N - P, \tag{8.138}$$

where N is the number of zeros in D and P is the number of poles, each
counted with its multiplicity. To show
this, we note that if $f(z) = (z-a)^m \varphi(z)$ where $\varphi$ is analytic and non-zero
near a, then

$$\frac{f'(z)}{f(z)} = \frac{m}{z-a} + \frac{\varphi'(z)}{\varphi(z)}, \tag{8.139}$$

so $f'/f$ has a simple pole at a with residue m. Here m can be either positive
or negative. The term $\varphi'(z)/\varphi(z)$ is analytic at z = a, so collecting all the
residues from each zero or pole gives the result.

Since $f'/f = \frac{d}{dz}\ln f$, the integral may be written

$$\oint_\Gamma \frac{f'(z)}{f(z)}\,dz = \Delta_\Gamma \ln f(z) = i\,\Delta_\Gamma \arg f(z), \tag{8.140}$$

the symbol $\Delta_\Gamma$ denoting the total change in the quantity after we traverse $\Gamma$.
Thus

$$N - P = \frac{1}{2\pi}\Delta_\Gamma \arg f(z). \tag{8.141}$$

This result is known as the principle of the argument.
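
The principle of the argument translates directly into a numerical recipe: follow arg f(z) round the contour and count the net number of turns. The following is a minimal sketch under stated assumptions. The contour is taken to be a circle, the rational function f is an invented example with a double zero at z = 0.5, a simple zero at z = 3 and a simple pole at z = −0.25, and the name `winding_number` is hypothetical.

```python
import cmath
import math

def winding_number(f, center=0j, radius=1.0, samples=20000):
    """(1/2 pi) * total change of arg f(z) as z goes once anticlockwise round the circle."""
    total = 0.0
    previous = f(center + radius)
    for k in range(1, samples + 1):
        z = center + radius * cmath.exp(2j * math.pi * k / samples)
        current = f(z)
        total += cmath.phase(current / previous)   # small phase increment between samples
        previous = current
    return round(total / (2.0 * math.pi))

def f(z):
    return (z - 0.5) ** 2 * (z - 3.0) / (z + 0.25)

print(winding_number(f, radius=1.0))   # double zero and simple pole inside: N - P = 2 - 1
print(winding_number(f, radius=4.0))   # all three zeros and the pole inside: N - P = 3 - 1
```

The method relies on the sampling being fine enough that the phase changes by much less than π between neighbouring points, and on f having no zeros or poles on the contour itself.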



Local mapping theorem 

Suppose the function w = f(z) maps a region $\Omega$ holomorphically onto a region
$\Omega'$, and a simple closed curve $\gamma \subset \Omega$ onto another closed curve $\Gamma \subset \Omega'$, which
will in general have self intersections. Given a point $a \in \Omega'$, we can ask
ourselves how many points within the simple closed curve $\gamma$ map to a. The
answer is given by the winding number of the image curve $\Gamma$ about a.

Figure 8.13: An analytic map is one-to-one where the winding number is
unity, but two-to-one at points where the image curve winds twice.

To see that this is so, we appeal to the principle of the argument as

$$\begin{aligned}
\#\text{ of zeros of } (f - a) \text{ within } \gamma &= \frac{1}{2\pi i}\oint_\gamma \frac{f'(z)}{f(z) - a}\,dz \\
&= \frac{1}{2\pi i}\oint_\Gamma \frac{dw}{w - a} \\
&= n(\Gamma, a),
\end{aligned} \tag{8.142}$$

where $n(\Gamma, a)$ is called the winding number of the image curve $\Gamma$ about a. It
is equal to

$$n(\Gamma, a) = \frac{1}{2\pi}\Delta_\gamma \arg(w - a), \tag{8.143}$$

and is the number of times the image point w encircles a as z traverses the
original curve $\gamma$.

Since the number of pre-image points cannot be negative, these winding
numbers must be positive. This means that the holomorphic image of a curve
winding in the anticlockwise direction is also a curve winding anticlockwise.

For mathematicians, another important consequence of this result is that
a holomorphic map is open, i.e. the holomorphic image of an open set is
itself an open set. The local mapping theorem is therefore sometimes called
the open mapping theorem.



8.5.2 Rouché's theorem



Here we provide an effective tool for locating zeros of functions.

Theorem (Rouché): Let f(z) and g(z) be analytic within and on a simple
closed contour $\gamma$. Suppose further that |g(z)| < |f(z)| everywhere on $\gamma$; then
f(z) and f(z) + g(z) have the same number of zeros within $\gamma$.

Before giving the proof, we illustrate Rouché's theorem by giving its most
important corollary: the algebraic completeness of the complex numbers, a
result otherwise known as the fundamental theorem of algebra. This asserts
that, if R is sufficiently large, a polynomial $P(z) = a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0$
has exactly n zeros, when counted with their multiplicity, lying within the
circle |z| = R. To prove this note that we can take R sufficiently big that

$$\begin{aligned}
|a_n z^n| &= |a_n| R^n \\
&> |a_{n-1}| R^{n-1} + |a_{n-2}| R^{n-2} + \cdots + |a_0| \\
&\ge |a_{n-1} z^{n-1} + a_{n-2} z^{n-2} + \cdots + a_0|,
\end{aligned} \tag{8.144}$$

on the circle |z| = R. We can therefore take $f(z) = a_n z^n$ and
$g(z) = a_{n-1} z^{n-1} + a_{n-2} z^{n-2} + \cdots + a_0$ in Rouché. Since $a_n z^n$ has exactly n zeros, all
lying at z = 0, within |z| = R, we conclude that so does P(z).

The proof of Rouché is a corollary of the principle of the argument. We
observe that

$$\begin{aligned}
\#\text{ of zeros of } f + g &= n(\Gamma, 0) \\
&= \frac{1}{2\pi}\Delta_\gamma \arg(f + g) \\
&= \frac{1}{2\pi}\Delta_\gamma \arg f + \frac{1}{2\pi}\Delta_\gamma \arg(1 + g/f).
\end{aligned} \tag{8.145}$$

Now |g/f| < 1 on $\gamma$, so 1 + g/f cannot circle the origin as we traverse $\gamma$.
As a consequence $\Delta_\gamma \arg(1 + g/f) = 0$. Thus the number of zeros of f + g
inside $\gamma$ is the same as that of f alone. (Naturally, they are not usually in
the same places.)

The geometric part of this argument is often illustrated by a dog on a
lead. If the lead has length L, and the dog's owner stays a distance R > L
away from a lamp post, then the dog cannot run round the lamp post unless
the owner does the same.

Figure 8.14: The curve $\Gamma$ is the image of $\gamma$ under the map f + g. If |g| < |f|,
then, as z traverses $\gamma$, f + g winds about the origin the same number of times
that f does.
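
Rouché's theorem can be illustrated with a short numerical experiment. The polynomial z⁴ + 6z + 3 used below is an assumed example, chosen here and not taken from the text, and the function names are hypothetical. On |z| = 2 the term z⁴ dominates, since |6z + 3| ≤ 15 < 16, so there should be four zeros inside. On |z| = 1 the term 6z dominates, since |z⁴ + 3| ≤ 4 < 6, so there should be exactly one. The sketch checks the inequality at sample points and counts zeros independently via the change in argument.

```python
import cmath
import math

def circle_points(radius, samples=20000):
    return [radius * cmath.exp(2j * math.pi * k / samples) for k in range(samples + 1)]

def zeros_inside(h, radius):
    """Number of zeros of an analytic h inside |z| = radius, from the change in arg h."""
    values = [h(z) for z in circle_points(radius)]
    total = sum(cmath.phase(b / a) for a, b in zip(values, values[1:]))
    return round(total / (2.0 * math.pi))

def rouche_condition_holds(f, g, radius):
    """Check |g| < |f| at the sample points on |z| = radius (a sampled check, not a proof)."""
    return all(abs(g(z)) < abs(f(z)) for z in circle_points(radius))

def p(z):
    return z ** 4 + 6 * z + 3

# On |z| = 2 take f = z^4, g = 6z + 3; on |z| = 1 take f = 6z, g = z^4 + 3.
ok_outer = rouche_condition_holds(lambda z: z ** 4, lambda z: 6 * z + 3, 2.0)
ok_inner = rouche_condition_holds(lambda z: 6 * z, lambda z: z ** 4 + 3, 1.0)
print(ok_outer, zeros_inside(p, 2.0))
print(ok_inner, zeros_inside(p, 1.0))
```

The same splitting idea, choosing which term plays the role of the dominant f on each circle, is the usual way to apply the theorem by hand.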



Exercise 8.8: Jacobi Theta Function. The function $\theta(z|\tau)$ is defined for $\mathrm{Im}\,\tau > 0$
by the sum

$$\theta(z|\tau) = \sum_{n=-\infty}^{\infty} e^{i\pi\tau n^2} e^{2\pi i n z}.$$

Show that $\theta(z+1|\tau) = \theta(z|\tau)$, and $\theta(z+\tau|\tau) = e^{-i\pi\tau - 2\pi i z}\,\theta(z|\tau)$. Use this
information and the principle of the argument to show that $\theta(z|\tau)$ has exactly one
zero in each unit cell of the Bravais lattice comprising the points $z = m + n\tau$;
$m, n \in \mathbb{Z}$. Show that these zeros are located at $z = (m + 1/2) + (n + 1/2)\tau$.

Exercise 8.9: Use Rouché's theorem to find the number of roots of the equation
$z^5 + 15z + 1 = 0$ lying within the circles, i) |z| = 2, ii) |z| = 3/2.



8.6 Analytic Functions and Topology

8.6.1 The Point at Infinity

Some functions, f(z) = 1/z for example, tend to a fixed limit (here 0) as z
becomes large, independently of in which direction we set off towards infinity.
Others, such as f(z) = exp z, behave quite differently depending on what
direction we take as |z| becomes large.

To accommodate the former type of function, and to be able to
legitimately write $f(\infty) = 0$ for f(z) = 1/z, it is convenient to add "$\infty$" to the
set of complex numbers. Technically, what we are doing is constructing
the one-point compactification of the locally compact space C. We often
portray this extended complex plane as a sphere $S^2$ (the Riemann sphere),
using stereographic projection to locate infinity at the north pole, and 0 at
the south pole.



Figure 8.15: Stereographic mapping of the complex plane to the 2-Sphere.

By the phrase a neighbourhood of z, we mean an open set containing z. We 
use the stereographic map to define a neighbourhood of infinity as the stere- 
ographic image of a neighbourhood of the north pole. With this definition, 
the extended complex plane $\mathbb{C} \cup \{\infty\}$ becomes topologically a sphere, and in
particular, becomes a compact set. 

If we wish to study the behaviour of a function "at infinity," we use the
map $z \mapsto \zeta = 1/z$ to bring $\infty$ to the origin, and study the behaviour of the
function there. Thus the polynomial

$$ f(z) = a_0 + a_1 z + \cdots + a_N z^N \tag{8.146} $$

becomes

$$ f(\zeta) = a_0 + a_1 \zeta^{-1} + \cdots + a_N \zeta^{-N}, \tag{8.147} $$

and so has a pole of order N at infinity. Similarly, the function $f(z) = z^{-3}$ has
a zero of order three at infinity, and sin z has an isolated essential singularity
there.

We must be careful about defining residues at infinity. The residue is
more a property of the 1-form f(z) dz than of the function f(z) alone, and
to find the residue we need to transform the dz as well as f(z). For example,
if we set $z = 1/\zeta$ in dz/z we have

$$ \frac{dz}{z} = \zeta\, d\!\left(\frac{1}{\zeta}\right) = -\frac{d\zeta}{\zeta}, \tag{8.148} $$

so the 1-form (1/z) dz has a pole at z = 0 with residue 1, and has a pole
with residue −1 at infinity, even though the function 1/z has no pole there.
This 1-form viewpoint is required for compatibility with the residue theorem:
the integral of 1/z around the positively oriented unit circle is simultaneously
minus the integral of 1/z about the oppositely oriented unit circle, now
regarded as a positively oriented circle enclosing the point at infinity. Thus
if f(z) has a pole of order N at infinity and

$$ \begin{aligned}
f(z) &= \cdots + a_{-2} z^{-2} + a_{-1} z^{-1} + a_0 + a_1 z + a_2 z^2 + \cdots + a_N z^N \\
     &= \cdots + a_{-2} \zeta^{2} + a_{-1} \zeta + a_0 + a_1 \zeta^{-1} + a_2 \zeta^{-2} + \cdots + a_N \zeta^{-N}
\end{aligned} \tag{8.149} $$

near infinity, then the residue at infinity must be defined to be $-a_{-1}$, and
not $a_1$ as one might naively have thought.
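The sign convention can be checked numerically. The sketch below is an illustration rather than part of the text's argument: the helper names, the contour radius and the second test function are our own choices. It integrates f(z) dz round a large circle traversed clockwise, which is the positive sense as seen from the point at infinity.

```python
import cmath

def contour_integral(f, radius, steps=4000):
    """(1/2 pi i) times the integral of f(z) dz round the counter-clockwise
    circle |z| = radius, using the trapezoidal rule."""
    total = 0j
    for k in range(steps):
        z = radius * cmath.exp(2j * cmath.pi * k / steps)
        dz = 1j * z * (2 * cmath.pi / steps)
        total += f(z) * dz
    return total / (2j * cmath.pi)

def residue_at_infinity(f, radius=50.0):
    """Residue of the 1-form f(z) dz at infinity. The circle must enclose all
    finite singularities; reversing its orientation supplies the minus sign."""
    return -contour_integral(f, radius)

# f(z) = 1/z: a_{-1} = 1, so the residue at infinity should be -1.
res_inverse = residue_at_infinity(lambda z: 1 / z)

# f(z) = (2z^2 + 1)/(z(z - 1)) = 2 + 2/z + ... for large |z|, so a_{-1} = 2.
res_rational = residue_at_infinity(lambda z: (2 * z * z + 1) / (z * (z - 1)))

print(res_inverse, res_rational)
```

In the second example the finite residues are −1 (at z = 0) and 3 (at z = 1), so the residue at infinity, −2, makes the sum of all residues on the sphere vanish.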

Once we have allowed $\infty$ as a point in the set we map from, it is only
natural to add it to the set we map to, in other words to allow $\infty$ as a
possible value for f(z). We will set $f(a) = \infty$ if $|f(z)|$ becomes unboundedly
large as $z \to a$ in any manner. Thus, if f(z) = 1/z we have $f(0) = \infty$.

The map

$$ w = \left(\frac{z - z_0}{z - z_\infty}\right)\left(\frac{z_1 - z_\infty}{z_1 - z_0}\right) \tag{8.150} $$

takes

$$ z_0 \to 0, \qquad z_1 \to 1, \qquad z_\infty \to \infty, \tag{8.151} $$

for example. Using this language, the Möbius maps

$$ w = \frac{az + b}{cz + d} \tag{8.152} $$

become one-to-one maps of $S^2 \to S^2$. They are the only such globally
conformal one-to-one maps. When the matrix

$$ \begin{pmatrix} a & b \\ c & d \end{pmatrix} $$

is an element of SU(2), the resulting one-to-one map is a rigid rotation of
the Riemann sphere. Stereographic projection is thus revealed to be the
geometric origin of the spinor representations of the rotation group.
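The rotation claim is easy to test numerically. In the sketch below the particular stereographic formula (unit sphere, infinity at the north pole, 0 at the south pole), the helper names and the sample numbers are assumptions made for illustration. A rigid rotation preserves the dot product between any two points of the sphere, so we compare dot products before and after the Möbius map.

```python
import math

def to_sphere(z):
    """Stereographic image of z on the unit sphere: 0 -> south pole, infinity -> north pole."""
    r2 = abs(z) ** 2
    return (2 * z.real / (r2 + 1), 2 * z.imag / (r2 + 1), (r2 - 1) / (r2 + 1))

def su2(alpha, beta):
    """Entries (a, b, c, d) of the SU(2) matrix built from alpha, beta after normalization."""
    norm = math.sqrt(abs(alpha) ** 2 + abs(beta) ** 2)
    alpha, beta = alpha / norm, beta / norm
    return alpha, beta, -beta.conjugate(), alpha.conjugate()

def mobius(matrix, z):
    a, b, c, d = matrix
    return (a * z + b) / (c * z + d)

def dot(p, q):
    return sum(x * y for x, y in zip(p, q))

matrix = su2(1 + 2j, 0.5 - 1j)          # arbitrary sample element of SU(2)
z1, z2 = 0.3 + 0.4j, -1.2 + 2.0j        # arbitrary sample points

before = dot(to_sphere(z1), to_sphere(z2))
after = dot(to_sphere(mobius(matrix, z1)), to_sphere(mobius(matrix, z2)))
print(before, after)
```

The two printed numbers should agree to rounding error. Replacing `matrix` by a generic matrix of unit determinant that is not unitary should, in general, spoil the agreement.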
If an analytic function f(z) has no essential singularities anywhere on
the Riemann sphere then f is rational, meaning that it can be written as
f(z) = P(z)/Q(z) for some polynomials P, Q.

We begin the proof of this fact by observing that f(z) can have only a
finite number of poles. If, to the contrary, f had an infinite number of poles
then the compactness of $S^2$ would ensure that the poles would have a limit
point somewhere. This would be a non-isolated singularity of f, and hence
an essential singularity. Now suppose we have poles at $z_1, z_2, \ldots, z_N$ with
principal parts

$$ \sum_{m=1}^{m_n} \frac{b_{n,m}}{(z - z_n)^m}. $$

If one of the $z_n$ is $\infty$, we first use a Möbius map to move it to some finite
point. Then

$$ F(z) = f(z) - \sum_{n=1}^{N} \sum_{m=1}^{m_n} \frac{b_{n,m}}{(z - z_n)^m} \tag{8.154} $$

is everywhere analytic, and therefore continuous, on $S^2$. But $S^2$ being
compact and F(z) being continuous implies that F is bounded. Therefore, by
Liouville's theorem, it is a constant C. Thus

$$ f(z) = C + \sum_{n=1}^{N} \sum_{m=1}^{m_n} \frac{b_{n,m}}{(z - z_n)^m}, $$

and this is a rational function. If we made use of a Möbius map to move
a pole at infinity, we use the inverse map to restore the original variables.
This manoeuvre does not affect the claimed result because Möbius maps take
rational functions to rational functions.

The map $z \mapsto f(z)$ given by the rational function

$$ f(z) = \frac{P(z)}{Q(z)} = \frac{a_n z^n + a_{n-1} z^{n-1} + \cdots + a_0}{b_n z^n + b_{n-1} z^{n-1} + \cdots + b_0} \tag{8.156} $$

wraps the Riemann sphere n times around the target $S^2$. In other words, it
is an n-to-one map.
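One way to see the n-to-one property at work is to count the solutions of f(z) = w, that is, the zeros of P(z) − wQ(z), by the principle of the argument. The sketch below is illustrative only: the cubic polynomials, the target value `w` and the contour radius are arbitrary choices, and the radius is assumed large enough to enclose every zero.

```python
import cmath

def winding_number(g, radius, steps=20000):
    """Number of times g(z) winds about the origin as z goes once,
    counter-clockwise, round the circle |z| = radius."""
    total = 0.0
    prev = g(complex(radius, 0.0))
    for k in range(1, steps + 1):
        cur = g(radius * cmath.exp(2j * cmath.pi * k / steps))
        total += cmath.phase(cur / prev)   # small change in arg g between neighbours
        prev = cur
    return round(total / (2 * cmath.pi))

def P(z):
    return z**3 + 2 * z + 1

def Q(z):
    return 2 * z**3 - z**2 + 5

w = 0.7 - 0.3j   # arbitrary target value, not equal to a_n/b_n = 1/2

# P - wQ is a polynomial, so its winding number counts its zeros inside the circle.
preimage_count = winding_number(lambda z: P(z) - w * Q(z), radius=50.0)
print(preimage_count)
```

For this generic `w` the count is three, the common degree of P and Q. For the special value w = 1/2 the leading terms cancel and one of the three preimages moves off to the point at infinity.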



8.6.2 Logarithms and Branch Cuts 



The function $y = \ln z$ is defined to be the solution to $z = \exp y$.
Unfortunately, since $\exp 2\pi i = 1$, the solution is not unique: if y is a solution, so is
$y + 2\pi i$. Another way of looking at this is that if $z = \rho \exp i\theta$, with $\rho$ real,
then $y = \ln \rho + i\theta$, and the angle $\theta$ has the same $2\pi$ ambiguity. Now there
is no such thing as a "many valued function." By definition, a function is a
machine into which we plug something and get a unique output. To make
$\ln z$ into a legitimate function we must select a unique $\theta = \arg z$ for each z.
This can be achieved by cutting the z plane along a curve extending from
the branch point at z = 0 all the way to infinity. Exactly where we put
this branch cut is not important; what is important is that it serve as an
impenetrable fence preventing us from following the continuous evolution of
the function along a path that winds around the origin.
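The effect of the fence can be seen numerically. Python's `cmath.log` places the cut along the negative real axis, which is one conventional choice among many. The sketch below (the helper `continue_log` and the step count are our own illustrative choices) compares this cut-plane function with the value obtained by following $\ln z$ continuously round the unit circle.

```python
import cmath

def continue_log(path):
    """Follow ln z continuously along a list of points, starting from the
    principal value at the first point. Steps must be small compared with |z|."""
    value = cmath.log(path[0])
    values = [value]
    for prev, cur in zip(path, path[1:]):
        value += cmath.log(cur / prev)   # cur/prev is close to 1, so no cut is crossed
        values.append(value)
    return values

steps = 400
circle = [cmath.exp(2j * cmath.pi * k / steps) for k in range(steps + 1)]

continued = continue_log(circle)
gain = continued[-1] - continued[0]          # change after one circuit of the origin

# The single-valued (principal) function jumps instead, across the negative real axis.
jump = cmath.log(complex(-1.0, 1e-9)) - cmath.log(complex(-1.0, -1e-9))

print(gain, jump)
```

Both printed numbers are close to $2\pi i$: the continuous evolution picks up $2\pi i$ per circuit, while the cut-plane function pays for its single-valuedness with a discontinuity of the same size across the cut.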

Similar branch cuts serve to make fractional powers single valued. We
define the power $z^\alpha$ for non-integral $\alpha$ by setting

$$ z^\alpha = \exp\{\alpha \ln z\} = |z|^\alpha e^{i\alpha\theta}, \tag{8.157} $$

where $z = |z| e^{i\theta}$. For the square root $z^{1/2}$ we get

$$ z^{1/2} = \sqrt{|z|}\, e^{i\theta/2}, \tag{8.158} $$

where $\sqrt{|z|}$ represents the positive square root of $|z|$. We can therefore make
this single-valued by a cut from 0 to $\infty$. To make $\sqrt{(z - a)(z - b)}$ single
valued we only need to cut from a to b. (Why? Think this through!)
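A numerical experiment can suggest the answer to the question just posed. The sketch below follows the square root continuously along a closed path by choosing, at each step, the sign closest to the previous value. The branch points a = −1, b = 1, the two circular paths and the helper names are all illustrative assumptions.

```python
import cmath

def continue_sqrt(g, path):
    """Follow sqrt(g(z)) continuously along `path`; returns (initial, final) values."""
    start = cmath.sqrt(g(path[0]))
    value = start
    for z in path[1:]:
        root = cmath.sqrt(g(z))
        # Of the two square roots, keep the one nearer to the previous value.
        value = root if abs(root - value) <= abs(root + value) else -root
    return start, value

def circle(centre, radius, steps=2000):
    return [centre + radius * cmath.exp(2j * cmath.pi * k / steps) for k in range(steps + 1)]

a, b = -1.0, 1.0

def g(z):
    return (z - a) * (z - b)

# A loop enclosing both branch points: the function returns to its starting value.
start_both, end_both = continue_sqrt(g, circle(0.0, 3.0))

# A loop enclosing only z = a: the function comes back with its sign reversed.
start_one, end_one = continue_sqrt(g, circle(a, 0.5))

print(end_both / start_both, end_one / start_one)
```

The printed ratios are close to +1 and −1 respectively. A cut from a to b blocks exactly the second kind of path, which is all that is needed.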

We can get away without cuts if we imagine the functions being maps from 
some set other than the complex plane. The new set is called a Riemann 
surface. It consists of a number of copies of the complex plane, one for each 
possible value of our "multivalued function." The map from this new surface 
is then single-valued, because each possible value of the function is the value
of the function evaluated at a point on a different copy. The copies of the 
complex plane are called sheets, and are connected to each other in a manner 
dictated by the function. The cut plane may now be thought of as a drawing 
of one level of the multilayered Riemann surface. Think of an architect's floor 
plan of a spiral-floored multi-story car park: If the architect starts drawing 
at one parking spot and works her way round the central core, at some point 
she will find that the floor has become the ceiling of the part already drawn. 
The rest of the structure will therefore have to be plotted on the plan of the 
next floor up — but exactly where she draws the division between one floor 
and the one above is rather arbitrary. The spiral car-park is a good model 
for the Riemann surface of the $\ln z$ function. See figure 8.16.

Figure 8.16: Part of the Riemann surface for $\ln z$. Each time we circle the
origin, we go up one level.

To see what happens for a square root, follow $z^{1/2}$ along a curve circling the
branch point singularity at z = 0. We come back to our starting point with
the function having changed sign; a second trip along the same path would
bring us back to the original value. The square root thus has only two sheets,
and they are cross-connected as shown in figure 8.17.




Figure 8.17: Part of the Riemann surface for $\sqrt{z}$. Two copies of $\mathbb{C}$ are cross-
connected. Circling the origin once takes you to the lower level. A second 
circuit brings you back to the upper level. 

In figures 8.16 and 8.17, we have shown the cross-connections being made 
rather abruptly along the cuts. This is not necessary — there is no singularity 
in the function at the cut — but it is often a convenient way to think about 
the structure of the surface. For example, the surface for $\sqrt{(z - a)(z - b)}$
also consists of two sheets. If we include the point at infinity, this surface 
can be thought of as two spheres, one inside the other, and cross connected 
along the cut from a to b. 

8.6.3 Topology of Riemann surfaces 

Riemann surfaces often have interesting topology. Indeed much of modern 
algebraic topology emerged from the need to develop tools to understand 
multiply-connected Riemann surfaces. As we have seen, the complex numbers,
with the point at infinity included, have the topology of a sphere. The
$\sqrt{(z - a)(z - b)}$ surface is still topologically a sphere. To see this, imagine
continuously deforming the Riemann sphere by pinching it at the equator
down to a narrow waist. Now squeeze the front and back of the waist together
and (imagining that the surface can pass freely through itself) fold
the upper half of the sphere inside the lower. The result is precisely the
two-sheeted $\sqrt{(z - a)(z - b)}$ surface described above. The Riemann surface
of the function $\sqrt{(z - a)(z - b)(z - c)(z - d)}$, which can be thought of as two
spheres, one inside the other and connected along two cuts, one from a to
b and one from c to d, is, however, a torus. Think of the torus as a bicycle
inner tube. Imagine using the fingers of your left hand to pinch the front and
back of the tube together and the fingers of your right hand to do the same
on the diametrically opposite part of the tube. Now fold the tube about the
pinch lines through itself so that one half of the tube is inside the other,
and connected to the outer half through two square-root cross-connects. If
you have difficulty visualizing this process, figures 8.18 and 8.19 show how
the two 1-cycles, $\alpha$ and $\beta$, that generate the homology group $H_1(T^2)$ appear
when drawn on the plane cut from a to b and c to d, and then when drawn on
the torus. Observe, in figure 8.18, how the curves in the two-sheeted plane
manage to intersect in only one point, just as they do when drawn on the
torus in figure 8.19.

Figure 8.18: The 1-cycles $\alpha$ and $\beta$ on the plane with two square-root branch
cuts. The dashed part of $\alpha$ lies hidden on the second sheet of the Riemann
surface.

That the topology of the twice-cut plane is that of a torus has important
consequences. This is because the elliptic integral

$$ w = I^{-1}(z) = \int_{z_0}^{z} \frac{dt}{\sqrt{(t-a)(t-b)(t-c)(t-d)}} \tag{8.159} $$

maps the twice-cut z-plane 1-to-1 onto the torus, the latter being considered
as the complex w-plane with the points $w$ and $w + n\omega_1 + m\omega_2$ identified. The
two numbers $\omega_{1,2}$ are given by

$$ \omega_1 = \oint_\alpha \frac{dt}{\sqrt{(t-a)(t-b)(t-c)(t-d)}}, \qquad
   \omega_2 = \oint_\beta \frac{dt}{\sqrt{(t-a)(t-b)(t-c)(t-d)}}, \tag{8.160} $$

and are called the periods of the elliptic function z = I(w). The map
$w \mapsto z = I(w)$ is a genuine function because the original z is uniquely determined
by w. It is doubly periodic because

$$ I(w + n\omega_1 + m\omega_2) = I(w), \qquad n, m \in \mathbb{Z}. \tag{8.161} $$

The inverse "function" $w = I^{-1}(z)$ is not a genuine function of z, however,
because w increases by $\omega_1$ or $\omega_2$ each time z goes around a curve deformable
into $\alpha$ or $\beta$, respectively. The periods are complicated functions of a, b, c, d.

If you recall our discussion of de Rham's theorem from chapter 4, you
will see that the $\omega_i$ are the results of pairing the closed holomorphic 1-form

$$ \text{``}dw\text{''} = \frac{dz}{\sqrt{(z-a)(z-b)(z-c)(z-d)}} \in H^1(T^2) \tag{8.162} $$

with the two generators of $H_1(T^2)$. The quotation marks about dw are
there to remind us that dw is not an exact form, i.e. it is not the exterior
derivative of a single-valued function w. This cohomological interpretation
of the periods of the elliptic function is the origin of the use of the word
"period" in the context of de Rham's theorem. (See section 10.5 for more
information on elliptic functions.)

More general Riemann surfaces are oriented 2-manifolds that can be
thought of as the surfaces of doughnuts with g holes. The number g is called
the genus of the surface. The sphere has g = 0 and the torus has g = 1.
The Euler character of the Riemann surface of genus g is $\chi = 2(1 - g)$. For
example, figure 8.20 shows a surface of genus three. The surface is in one
piece, so $\dim H_0(M) = 1$. The other Betti numbers are $\dim H_1(M) = 6$ and
$\dim H_2(M) = 1$, so

$$ \chi = \sum_{p=0}^{2} (-1)^p \dim H_p(M) = 1 - 6 + 1 = -4, \tag{8.163} $$

in agreement with $\chi = 2(1 - 3) = -4$. For complicated functions, the genus
may be infinite.

Figure 8.20: A surface M of genus 3. The non-bounding 1-cycles $\alpha_i$ and $\beta_i$
form a basis of $H_1(M)$. The entire surface forms the single 2-cycle that spans
$H_2(M)$.

If we have two complex variables z and w then a polynomial relation
P(z, w) = 0 defines a complex algebraic curve. Except for degenerate cases,
this one (complex) dimensional curve is simultaneously a two (real) dimensional
Riemann surface. With

$$ P(z, w) = z^3 + 3w^2 z + w + 3 = 0, \tag{8.164} $$

for example, we can think of z(w) being a three-sheeted function of w defined
by solving this cubic. Alternatively we can consider w(z) to be the two-sheeted
function of z obtained by solving the quadratic equation

$$ w^2 + \frac{1}{3z}\, w + \frac{(3 + z^3)}{3z} = 0. \tag{8.165} $$

In each case the branch points will be located where two or more roots
coincide. The roots of (8.165), for example, coincide when

$$ 1 - 12z(3 + z^3) = 0. \tag{8.166} $$

This quartic equation has four solutions, so there are four square-root branch
points. Although constructed differently, the Riemann surface for w(z) and
the Riemann surface for z(w) will have the same genus (in this case g = 1)
because they really are one and the same object: the algebraic curve
defined by the original polynomial equation.
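The count of branch points, and the genus that follows from it, can be checked with a few lines of arithmetic. The sketch below rests on two assumptions that we state explicitly: a repeated root of the quartic would have to be a root of its derivative as well, and a two-sheeted surface with 2k square-root branch points has genus k − 1 (the pattern sphere, torus, genus 2 for two, four and six branch points that appears in this chapter). The function names are our own.

```python
import cmath

def discriminant(z):
    """Discriminant of 3z w^2 + w + (z^3 + 3) = 0, regarded as a quadratic in w."""
    return 1 - 12 * z * (3 + z**3)

def discriminant_derivative(z):
    return -36 - 48 * z**3

# A repeated root of the quartic would also solve z^3 = -3/4.
critical_points = [
    (0.75 ** (1 / 3)) * cmath.exp(1j * cmath.pi * (2 * k + 1) / 3) for k in range(3)
]
has_repeated_root = any(abs(discriminant(z)) < 1e-9 for z in critical_points)

def two_sheeted_genus(branch_points):
    """Genus of a two-sheeted surface with an even number of square-root branch points."""
    if branch_points < 2 or branch_points % 2:
        raise ValueError("need an even number (at least 2) of branch points")
    return branch_points // 2 - 1

# With no repeated root, the quartic has four distinct roots, hence four branch points.
genus = two_sheeted_genus(4)
print(has_repeated_root, genus)
```

The quartic has no repeated root, so the four branch points are distinct and the genus is one. For large $|z|$ the square root of the discriminant behaves like a multiple of $z^2$, so infinity is not an additional branch point.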

In order to capture all its points at infinity, we often consider a complex
algebraic curve as being a subset of $\mathbb{C}P^2$. To do this we make the defining
equation homogeneous by introducing a third co-ordinate. For example, for
(8.164) we make

$$ P(z, w) = z^3 + 3w^2 z + w + 3 \;\to\; P(z, w, v) = z^3 + 3w^2 z + wv^2 + 3v^3. \tag{8.167} $$

The points where P(z, w, v) = 0 define${}^{7}$ a projective curve lying in $\mathbb{C}P^2$.
Places on this curve where the co-ordinate v is zero are the added points at
infinity. Places where v is non-zero (and where we may as well set v = 1)
constitute the original affine curve.

A generic (non-singular) curve

$$ P(z, w) = \sum_{r,s} a_{rs}\, z^r w^s = 0, \tag{8.168} $$

with its points at infinity included, has genus

$$ g = \frac{1}{2}(d - 1)(d - 2). \tag{8.169} $$

Here d = max (r + s) is the degree of the curve. This degree-genus relation
is due to Plücker. It is not, however, trivial to prove. Also not easy to prove
is Riemann's theorem of 1852 that any finite genus Riemann surface is the
complex algebraic curve associated with some two-variable polynomial.

The two assertions in the previous paragraph seem to contradict each
other. "Any" finite genus must surely include g = 2, but how can a genus
two surface be a complex algebraic curve? There is no integer value of d such
that (d − 1)(d − 2)/2 = 2. This is where the "non-singular" caveat becomes
important. An affine curve P(z, w) = 0 is said to be singular at $P = (z_0, w_0)$
if all of

$$ P(z, w), \quad \frac{\partial P}{\partial z}, \quad \frac{\partial P}{\partial w} $$

vanish at P. A projective curve is singular at $P \in \mathbb{C}P^2$ if all of

$$ P(z, w, v), \quad \frac{\partial P}{\partial z}, \quad \frac{\partial P}{\partial w}, \quad \frac{\partial P}{\partial v} $$

are zero there. If the curve has a singular point then it degenerates and
ceases to be a manifold. Now Riemann's construction does not guarantee
an embedding of the surface into $\mathbb{C}P^2$, only an immersion. The distinction
between these two concepts is that an immersed surface is allowed to
self-intersect, while an embedded one is not. Being a double root of the defining
equation P(z, w) = 0, a point of self-intersection is necessarily a singular
point.

${}^{7}$ A homogeneous polynomial P(z, w, v) of degree n does not provide a map from
$\mathbb{C}P^2 \to \mathbb{C}$ because $P(\lambda z, \lambda w, \lambda v) = \lambda^n P(z, w, v)$ usually depends on $\lambda$, while the
co-ordinates $(\lambda z, \lambda w, \lambda v)$ and $(z, w, v)$ correspond to the same point in $\mathbb{C}P^2$. The zero set
where P = 0 is, however, well-defined in $\mathbb{C}P^2$.

As an illustration of a singular curve, consider our earlier example of the
curve

$$ w^2 = (z - a)(z - b)(z - c)(z - d) \tag{8.170} $$

whose Riemann surface we know to be a torus once some points are
added at infinity, and when a, b, c, d are all distinct. The degree-genus formula
applied to this degree four curve gives, however, g = 3 instead of the expected
g = 1. This is because the corresponding projective curve

$$ w^2 v^2 = (z - av)(z - bv)(z - cv)(z - dv) \tag{8.171} $$

has a tacnode singularity at the point (z, w, v) = (0, 1, 0). Rather than
investigate this rather complicated singularity at infinity, we will consider
the simpler case of what happens if we allow b to coincide with c. When b
and c merge, the finite point $P = (w_0, z_0) = (0, b)$ becomes singular. Near
the singularity, the equation defining our curve looks like

$$ 0 = w^2 - (b - a)(b - d)(z - b)^2, \tag{8.172} $$

which is the equation of two lines, $w = \sqrt{(b-a)(b-d)}\,(z - b)$ and
$w = -\sqrt{(b-a)(b-d)}\,(z - b)$, that intersect at the point (w, z) = (0, b).
To understand what is happening
topologically it is first necessary to realize that a complex line is a copy of $\mathbb{C}$
and hence, after the point at infinity is included, is topologically a sphere. A
pair of intersecting complex lines is therefore topologically a pair of spheres
sharing a common point. Our degenerate curve only looks like a pair of
lines near the point of intersection, however. To see the larger picture, look
back at the figure of the twice-cut plane, where we see that as b approaches
c we have an $\alpha$ cycle of zero total length. A zero length cycle means that
the circumference of the torus becomes zero at P, so that it looks like a
bent sausage with its two ends sharing the common point P. Instead of two
separate spheres, our sausage is equivalent to a single two-sphere with two
points identified.



Figure 8.21: A degenerate torus is topologically the same as a sphere with
two points identified. 

As it stands, such a set is no longer a manifold because any neighbourhood of 
P will contain bits of both ends of the sausage, and therefore cannot be given 
co-ordinates that make it look like a region in $\mathbb{R}^2$. We can, however, simply
agree to delete the common point, and then plug the resulting holes in the 
sausage ends with two distinct points. The new set is again a manifold, and 
topologically a sphere. From the viewpoint of the pair of intersecting lines, 
this construction means that we stay on one line, and ignore the other as it 
passes through. 

A similar resolution of singularities allows us to regard immersed surfaces
as non-singular manifolds, and it is in this sense that Riemann's theorem is to
be understood. When n such self-intersection double points are deleted and
replaced by pairs of distinct points, the degree-genus formula becomes

$$ g = \frac{1}{2}(d - 1)(d - 2) - n, \tag{8.173} $$

and this can take any non-negative integer value.
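A few lines of arithmetic make the resolution of the paradox concrete. This is only a sketch: the function name `genus` is ours, and the last comment, which counts the tacnode of the quartic curve as two double points, states a standard fact that the text itself does not spell out.

```python
def genus(d, n=0):
    """Genus of a degree-d plane curve with n self-intersection double points resolved."""
    g = (d - 1) * (d - 2) // 2 - n
    if g < 0:
        raise ValueError("too many double points for this degree")
    return g

# Non-singular curves: genus 2 never appears in the list.
smooth_genera = [genus(d) for d in range(1, 7)]
print(smooth_genera)           # [0, 0, 1, 3, 6, 10]

# A quartic with one double point supplies the missing genus 2.
print(genus(4, 1))             # 2

# The curve w^2 = (z-a)(z-b)(z-c)(z-d): degree 4, genus 1. This is consistent
# with the formula if its tacnode is counted as two double points.
print(genus(4, 2))             # 1
```

Since (d − 1)(d − 2)/2 grows without bound and n may be any integer from zero up to that value, every non-negative genus is attained.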
8.6.4 Conformal geometry of Riemann surfaces

In this section we recall Hodge's theory of harmonic forms from section 4.7.1, 
and see how it looks from a complex variable perspective. This viewpoint 
reveals a relationship between Riemann surfaces and Riemann manifolds that 
forms an important ingredient in string and conformal field theory. 

Isothermal co-ordinates and complex structure 

Suppose we have a two-dimensional orientable Riemann manifold M with
metric

$$ ds^2 = g_{ij}\, dx^i dx^j. \tag{8.174} $$

In two dimensions $g_{ij}$ has three independent components. When we make a
co-ordinate transformation we have two arbitrary functions at our disposal,
and so we can use this freedom to select local co-ordinates in which only one
independent component remains. The most useful choice is isothermal (also
called conformal) co-ordinates x, y in which the metric tensor is diagonal,
$g_{ij} = e^{\sigma}\delta_{ij}$, and so

$$ ds^2 = e^{\sigma}(dx^2 + dy^2). \tag{8.175} $$

The $e^{\sigma}$ is called the scale factor or conformal factor. If we set $z = x + iy$
and $\bar z = x - iy$ the metric becomes

$$ ds^2 = e^{\sigma} dz\, d\bar z. \tag{8.176} $$

We can construct isothermal co-ordinates for some open neighbourhood of
any point in M. If in an overlapping isothermal co-ordinate patch the metric
is

$$ ds^2 = e^{\tau} d\zeta\, d\bar\zeta, \tag{8.177} $$

and if the co-ordinates have the same orientation, then in the overlap region
$\zeta$ must be a function only of $z$ and $\bar\zeta$ a function only of $\bar z$. This is so that

$$ e^{\tau(\zeta,\bar\zeta)}\, d\zeta\, d\bar\zeta = e^{\sigma(z,\bar z)} \left|\frac{dz}{d\zeta}\right|^2 d\zeta\, d\bar\zeta \tag{8.178} $$

without any $d\zeta^2$ or $d\bar\zeta^2$ terms appearing. A manifold with an atlas of complex
charts whose change-of-co-ordinate formulae are holomorphic in this way is
said to be a complex manifold, and the co-ordinates endow it with a complex
structure. The existence of a global complex structure allows us to define
the notion of meromorphic and rational functions on M. Our Riemann
manifold is therefore also a Riemann surface.

While any compact, orientable, two-dimensional Riemann manifold has
a complex structure that is determined by the metric, the mapping
metric $\to$ complex structure is not one-to-one. Two metrics $g_{ij}$, $\tilde g_{ij}$ that are related
by a conformal scale factor

$$ \tilde g_{ij} = \lambda(x^1, x^2)\, g_{ij} \tag{8.179} $$

give rise to the same complex structure. Conversely, a pair of two-dimensional
Riemann manifolds having the same complex structure have metrics that are
related by a scale factor.

The use of isothermal co-ordinates simplifies many computations. Firstly,
observe that $g_{ij}/\sqrt{g} = \delta_{ij}$, the conformal factor having cancelled. If you look
back at its definition, you will see that this means that when the Hodge "$\star$"
map acts on one-forms, the result is independent of the metric. If $\omega$ is a
one-form

$$ \omega = p\, dx + q\, dy, \tag{8.180} $$

then

$$ \star\omega = -q\, dx + p\, dy. \tag{8.181} $$

Note that, on one-forms,

$$ \star\star = -1. \tag{8.182} $$

With $z = x + iy$, $\bar z = x - iy$, we have

$$ \omega = \frac{1}{2}(p - iq)\, dz + \frac{1}{2}(p + iq)\, d\bar z. \tag{8.183} $$

Let us focus on the dz part:

$$ A = \frac{1}{2}(p - iq)\, dz = \frac{1}{2}(p - iq)(dx + i\, dy). \tag{8.184} $$

Then

$$ \star A = \frac{1}{2}(p - iq)(dy - i\, dx) = -iA. \tag{8.185} $$

Similarly, with

$$ B = \frac{1}{2}(p + iq)\, d\bar z \tag{8.186} $$

we have

$$ \star B = iB. \tag{8.187} $$

Thus the $dz$ and $d\bar z$ parts of the original form are separately eigenvectors of $\star$
with different eigenvalues. We use this observation to construct a resolution
of the identity Id into the sum of two projection operators

$$ \mathrm{Id} = \frac{1}{2}(1 + i\star) + \frac{1}{2}(1 - i\star) = P + \bar P, \tag{8.188} $$

where $P$ projects on the $dz$ part and $\bar P$ onto the $d\bar z$ part of the form.
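These relations are simple enough to verify mechanically. In the sketch below a one-form $p\,dx + q\,dy$ at a point is stored as the pair (p, q), so that $dz = dx + i\,dy$ corresponds to (1, i). The representation, the helper names and the sample numbers are our own illustrative choices.

```python
def star(form):
    """Hodge star on a one-form p dx + q dy, stored as the pair (p, q)."""
    p, q = form
    return (-q, p)

def scale(c, form):
    return (c * form[0], c * form[1])

def add(f, g):
    return (f[0] + g[0], f[1] + g[1])

def P(form):
    """The projector (1 + i*)/2, which keeps the dz part."""
    return scale(0.5, add(form, scale(1j, star(form))))

def P_bar(form):
    """The projector (1 - i*)/2, which keeps the dz-bar part."""
    return scale(0.5, add(form, scale(-1j, star(form))))

omega = (2.0, -3.0)               # p = 2, q = -3: an arbitrary sample form
A, B = P(omega), P_bar(omega)

print(star(star(omega)))          # (-2.0, 3.0), i.e. minus omega
print(A, scale(-1j, A), star(A))  # star A equals -i A
print(B, scale(1j, B), star(B))   # star B equals +i B
print(add(A, B))                  # recovers omega
```

The first component of `A` is $(p - iq)/2$ and its second component is $i$ times the first, which is the statement that `A` is a multiple of $dz$.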

The original form is harmonic if it is both closed, $d\omega = 0$, and co-closed,
$d\star\omega = 0$. Thus, in two dimensions, the notion of being harmonic (i.e. a
solution of Laplace's equation) is independent of what metric we are given.
If $\omega$ is a harmonic form, then $(p - iq)\,dz$ and $(p + iq)\,d\bar z$ are separately closed.
Observe that $(p - iq)\,dz$ being closed means that $\partial_{\bar z}(p - iq) = 0$, and so $p - iq$
is a holomorphic (and hence harmonic) function. Since both $(p - iq)$ and $dz$
depend only on z, we will call $(p - iq)\,dz$ a holomorphic 1-form. The complex
conjugate form

$$ \overline{(p - iq)\,dz} = (p + iq)\, d\bar z \tag{8.189} $$

then depends only on $\bar z$ and is anti-holomorphic.

Riemann bilinear relations 

As an illustration of the interplay of harmonic forms and two-dimensional
topology, we derive some famous formulae due to Riemann. These formulae
have applications in string theory and in conformal field theory.

Suppose that M is a Riemann surface of genus g, with $\alpha_i$, $\beta_i$, $i = 1, \ldots, g$,
the representative generators of $H_1(M)$ that intersect as shown in figure 8.20.
By applying Hodge-de Rham to this surface, we know that we can select
a set of 2g independent, real, harmonic, 1-forms as a basis of $H^1(M, \mathbb{R})$.
With the aid of the projector P we can assemble these into g holomorphic
closed 1-forms $\omega_i$, together with g anti-holomorphic closed 1-forms $\bar\omega_i$, the
original 2g real forms being recovered from these as $\omega_i + \bar\omega_i$ and
$\star(\omega_i + \bar\omega_i) = i(\bar\omega_i - \omega_i)$. A physical interpretation of these forms is as the $z$ and
$\bar z$ components of irrotational and incompressible fluid flows on the surface
M. It is not surprising that such flows form a 2g real dimensional, or g
complex dimensional, vector space because we can independently specify the
circulation $\oint \mathbf{v}\cdot d\mathbf{r}$ around each of the 2g generators of $H_1(M)$. If the flow field
has (covariant) components $v_x$, $v_y$, then $\omega = v_z\, dz$ where $v_z = (v_x - iv_y)/2$,
and $\bar\omega = v_{\bar z}\, d\bar z$ where $v_{\bar z} = (v_x + iv_y)/2$.

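Suppose now that a and b are closed 1-forms on M. Then, either by
exploiting the powerful and general intersection-form formula (4.77) or by
cutting open the surface along the curves $\alpha_i$, $\beta_i$ and using the more direct
strategy that gave us (4.79), we find that

$$ \int_M a \wedge b = \sum_{i=1}^{g} \left( \int_{\alpha_i} a \int_{\beta_i} b - \int_{\beta_i} a \int_{\alpha_i} b \right). \tag{8.190} $$

We use this formula to derive two bilinear relations associated with a closed
holomorphic 1-form $\omega$. Firstly we compute its Hodge inner-product norm

$$ \begin{aligned}
\|\omega\|^2 \equiv \int_M \omega \wedge \star\bar\omega
  &= i \int_M \omega \wedge \bar\omega \\
  &= i \sum_{i=1}^{g} \left( \int_{\alpha_i} \omega \int_{\beta_i} \bar\omega - \int_{\beta_i} \omega \int_{\alpha_i} \bar\omega \right) \\
  &= i \sum_{i=1}^{g} \{ A_i \bar B_i - B_i \bar A_i \},
\end{aligned} \tag{8.191} $$

where $A_i = \int_{\alpha_i} \omega$ and $B_i = \int_{\beta_i} \omega$. We have used the fact that $\bar\omega$ is an
anti-holomorphic 1-form and thus an eigenvector of $\star$ with eigenvalue $i$. It follows,
therefore, that if all the $A_i$ are zero then $\|\omega\| = 0$ and so $\omega = 0$.

Let $A_{ij} = \int_{\alpha_i} \omega_j$. The determinant of the matrix $A_{ij}$ is non-zero: if it
were zero, then there would be numbers $\lambda_j$, not all zero, such that

$$ 0 = A_{ij}\lambda_j = \int_{\alpha_i} (\omega_j \lambda_j), \tag{8.192} $$

but, by (8.191), this implies that $\|\omega_j \lambda_j\| = 0$ and hence $\omega_j \lambda_j = 0$, contrary
to the linear independence of the $\omega_i$. We can therefore solve the equations

$$ A_{ij} \lambda_{jk} = \delta_{ik} \tag{8.193} $$

for the numbers $\lambda_{jk}$ and use these to replace each of the $\omega_i$ by the linear
combination $\omega_j \lambda_{ji}$. The new $\omega_i$ then obey $\int_{\alpha_i} \omega_j = \delta_{ij}$. From now on we
suppose that this has been done.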
Define $\tau_{ij} = \int_{\beta_i} \omega_j$. Observe that $dz \wedge dz = 0$ forces $\omega_i \wedge \omega_j = 0$, and
therefore we have a second relation

$$ 0 = \int_M \omega_m \wedge \omega_n
     = \sum_{i=1}^{g} \left( \int_{\alpha_i} \omega_m \int_{\beta_i} \omega_n - \int_{\beta_i} \omega_m \int_{\alpha_i} \omega_n \right)
     = \tau_{mn} - \tau_{nm}. \tag{8.194} $$

The matrix $\tau_{ij}$ is therefore symmetric. A similar computation shows that

$$ \|\lambda_i \omega_i\|^2 = 2 \bar\lambda_i (\mathrm{Im}\,\tau_{ij}) \lambda_j \tag{8.195} $$

so the matrix $(\mathrm{Im}\,\tau_{ij})$ is positive definite. The set of such symmetric matrices
whose imaginary part is positive definite is called the Siegel upper half-plane.
Not every such matrix corresponds to a Riemann surface, but when it does it
encodes all information about the shape of the Riemann manifold M that is
left invariant under conformal rescaling.



8.7 Further Exercises and Problems 

Exercise 8.10: Harmonic partners. Show that the function 

u = sin x cosh y + 2 cos x sinh y 
is harmonic. Determine the corresponding analytic function u + iv. 
Exercise 8.11: Möbius Maps. The map

$$ z \mapsto w = \frac{az + b}{cz + d} $$

is called a Möbius transformation. These maps are important because they are
the only one-to-one conformal maps of the Riemann sphere onto itself.

a) Show that two successive Möbius transformations

$$ z' = \frac{az + b}{cz + d}, \qquad z'' = \frac{Az' + B}{Cz' + D} $$

give rise to another Möbius transformation, and show that the rule for
combining them is equivalent to matrix multiplication.

b) Let $z_1, z_2, z_3, z_4$ be complex numbers. Show that a necessary and
sufficient condition for the four points to be concyclic is that their
cross-ratio

$$ \{z_1, z_2, z_3, z_4\} \stackrel{\mathrm{def}}{=} \frac{(z_1 - z_4)(z_3 - z_2)}{(z_1 - z_2)(z_3 - z_4)} $$

be real (Hint: use a well-known property of opposite angles of a cyclic
quadrilateral). Show that Möbius transformations leave the cross-ratio
invariant, and thus take circles into circles.

Exercise 8.12: Hyperbolic geometry. The Riemann metric for the Poincaré-disc
model of Lobachevski's hyperbolic plane (See exercises ??.?? and 3.13)
can be taken to be

$$ ds^2 = \frac{4\,|dz|^2}{(1 - |z|^2)^2}, \qquad |z|^2 < 1. $$

a) Show that the Möbius transformation

$$ z \mapsto w = e^{i\lambda}\, \frac{z - a}{\bar a z - 1}, \qquad |a| < 1, \quad \lambda \in \mathbb{R} $$

provides a 1-1 map of the interior of the unit disc onto itself. Show that
these maps form a group.

b) Show that the hyperbolic-plane metric is left invariant under the group
of maps in part (a). Deduce that such maps are orientation-preserving
isometries of the hyperbolic plane.

c) Use the circle-preserving property of the Mobius maps to deduce that 
circles in hyperbolic geometry are represented in the Poincare disc by 
Euclidean circles that lie entirely within the disc. 

The conformal maps of part (a) are in fact the only orientation preserving 
isometries of the hyperbolic plane. With the exception of circles centered at 
z = 0, the center of the hyperbolic circle does not coincide with the center 
of its representative Euclidean circle. Euclidean circles that are internally 
tangent to the boundary of the unit disc have infinite hyperbolic radius and 
their hyperbolic centers lie on the boundary of the unit disc and hence at 
hyperbolic infinity. They are known as horocycles. 

Exercise 8.13: Rectangle to Ellipse. Consider the map $w \mapsto z = \sin w$. Draw
a picture of the image, in the z plane, of the interior of the rectangle with
corners $u = \pm\pi/2$, $v = \pm\lambda$ ($w = u + iv$). Show which points correspond to
the corners of the rectangle, and verify that the vertex angles remain $\pi/2$. At
what points does the isogonal property fail?
Exercise 8.14: The part of the negative real axis where $x < -1$ is occupied
by a conductor held at potential $-V_0$. The positive real axis for $x > +1$
is similarly occupied by a conductor held at potential $+V_0$. The conductors
extend to infinity in both directions perpendicular to the x-y plane, and so
the potential V satisfies the two-dimensional Laplace equation.

a) Find the image in the $\zeta$ plane of the cut z plane where the cuts run from
$-1$ to $-\infty$ and from $+1$ to $+\infty$ under the map $z \mapsto \zeta = \sin^{-1} z$.

b) Use your answer from part a) to solve the electrostatic problem and
show that the field lines and equipotentials are conic sections of the form
$ax^2 + by^2 = 1$. Find expressions for a and b for both the field lines and
the equipotentials and draw a labelled sketch to illustrate your results.

Exercise 8.15: Draw the image under the map $z \mapsto w = e^{\pi z/a}$ of the infinite
strip S, consisting of those points $z = x + iy \in \mathbb{C}$ for which $0 < y < a$.
Label enough points to show which point in the w plane corresponds to which
in the z plane. Hence or otherwise show that the Dirichlet Green function
$G(x, y; x_0, y_0)$ that obeys

$$ \nabla^2 G = \delta(x - x_0)\,\delta(y - y_0) $$

in S, and $G(x, y; x_0, y_0) = 0$ for (x, y) on the boundary of S, can be written as

$$ G(x, y; x_0, y_0) = \frac{1}{2\pi} \ln \left| \sinh(\pi(z - z_0)/2a) \right| + \ldots $$

The dots indicate the presence of a second function, similar to the first, that
you should find. Assume that $(x_0, y_0) \in S$.

Exercise 8.16: State Laurent's theorem for functions analytic in an annulus. 
Include formulae for the coefficients of the expansion. Show that, suitably 
interpreted, this theorem reduces to a form of Fourier's theorem for functions 
analytic in a neighbourhood of the unit circle. 

Exercise 8.17: Laurent Paradox. Show that in the annulus 1 < \z\ < 2 the 
function 

$$ f(z) = \frac{1}{(z - 1)(2 - z)} $$

has a Laurent expansion in powers of z. Find the coefficients. The part of the 
series with negative powers of z does not terminate. Does this mean that f(z) 
has an essential singularity at z = 0? 
Exercise 8.18: Assuming the following series

$$ \frac{1}{\sinh z} = \frac{1}{z} - \frac{1}{6} z + \frac{7}{360} z^3 + \ldots $$

evaluate the integral

$$ I = \oint_{|z|=1} \frac{1}{z^2 \sinh z}\, dz. $$

Now evaluate the integral

$$ I = \oint_{|z|=4} \frac{1}{z^2 \sinh z}\, dz. $$

(Hint: The zeros of $\sinh z$ lie at $z = n\pi i$.)

Exercise 8.19: State the theorem relating the difference between the number
of poles and zeros of f(z) in a region to the winding number of the argument of
f(z). Hence, or otherwise, evaluate the integral

$$ I = \oint_C \frac{5z^4 + 1}{z^5 + z + 1}\, dz $$

where C is the circle $|z| = 2$. Prove, including a statement of any relevant
theorem, any assertions you make about the locations of the zeros of $z^5 + z + 1$.

Exercise 8.20: Arcsine branch cuts. Let $w = \sin^{-1} z$. Show that

$$w = n\pi \pm i\ln\left\{iz + \sqrt{1 - z^2}\right\},$$

with the $\pm$ being selected depending on whether $n$ is odd or even. Where
would you put cuts to ensure that $w$ is a single-valued function?

Problem 8.21: Cutting open a genus-2 surface. The Riemann surface for the
function

$$y = \sqrt{(z - a_1)(z - a_2)(z - a_3)(z - a_4)(z - a_5)(z - a_6)}$$

has genus $g = 2$. Such a surface $M$ is sketched in figure 8.22, where the four
independent 1-cycles $\alpha_{1,2}$ and $\beta_{1,2}$ that generate $H_1(M)$ have been drawn so
that they share a common vertex.

a) Realize the genus-2 surface as two copies of $\mathbb{C} \cup \{\infty\}$ cross-connected by
three square-root branch cuts. Sketch how the 1-cycles $\alpha_i$ and $\beta_i$, $i = 1, 2$,
of figure 8.22 appear when drawn on your thrice-cut plane.

b) Cut the surface open along the four 1-cycles, and show that the resulting
surface is homeomorphic to the octagonal region appearing in figure 8.23.

Figure 8.23: The cut-open genus-2 surface. The superscripts L and R denote
respectively the left and right sides of each 1-cycle, viewed from the direction
of the arrow orienting the cycle.

c) Apply the direct method that gave us (4.79) to the octagonal region of
part b). Hence show that for closed 1-forms $a$, $b$ on the surface we have

$$\int_M a \wedge b = \sum_{i=1}^{2}\left\{\int_{\alpha_i} a \int_{\beta_i} b - \int_{\beta_i} a \int_{\alpha_i} b\right\}.$$



Chapter 9 

Complex Analysis II 



In this chapter we will apply what we have learned of complex variables. The
applications will range from the elementary to the sophisticated.

9.1 Contour Integration Technology

The goal of contour integration technology is to evaluate ordinary, real-variable,
definite integrals. We have already met the basic tool, the residue
theorem:

Theorem: Let $f(z)$ be analytic within and on the boundary $\Gamma = \partial D$ of a
simply connected domain $D$, with the exception of a finite number of points
at which the function has poles. Then

$$\oint_\Gamma f(z)\,dz = \sum_{\text{poles} \in D} 2\pi i\,(\text{residue at pole}).$$

The effective application of the residue theorem is something of an art, but
there are useful classes of integrals which we can learn to recognize.

9.1.1 Tricks of the Trade

Rational Trigonometric Expressions

Integrals of the form

$$\int_0^{2\pi} F(\cos\theta, \sin\theta)\,d\theta \tag{9.1}$$

are dealt with by writing $\cos\theta = \frac{1}{2}(z + \bar z)$, $\sin\theta = \frac{1}{2i}(z - \bar z)$ and integrating
around the unit circle. For example, let $a$, $b$ be real and $|b| < a$, then



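$$\int_0^{2\pi}\frac{d\theta}{a + b\cos\theta} = \frac{2}{i}\oint_{|z|=1}\frac{dz}{bz^2 + 2az + b} = \frac{2}{ib}\oint_{|z|=1}\frac{dz}{(z - \alpha)(z - \beta)}. \tag{9.2}$$

Since $\alpha\beta = 1$, only one pole is within the contour. This is at

$$\alpha = \left(-a + \sqrt{a^2 - b^2}\right)/b. \tag{9.3}$$

The residue is

$$\frac{2}{ib}\,\frac{1}{\alpha - \beta} = \frac{1}{i}\,\frac{1}{\sqrt{a^2 - b^2}}. \tag{9.4}$$

Therefore, the integral is given by

$$\int_0^{2\pi}\frac{d\theta}{a + b\cos\theta} = \frac{2\pi}{\sqrt{a^2 - b^2}}. \tag{9.5}$$

These integrals are, of course, also do-able by the "t" substitution $t = \tan(\theta/2)$, whence

$$\sin\theta = \frac{2t}{1 + t^2}, \qquad \cos\theta = \frac{1 - t^2}{1 + t^2}, \qquad d\theta = \frac{2\,dt}{1 + t^2}, \tag{9.6}$$

followed by a partial fraction decomposition. The labour is perhaps slightly
less using the contour method.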
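
As a sanity check on (9.5), the following minimal sketch compares a direct numerical quadrature with the residue result. The function names and the sample values $a = 2$, $b = 1$ are illustrative assumptions, not part of the text.

```python
import math

def trig_integral(a, b, n=2000):
    """Midpoint-rule estimate of the integral of 1/(a + b cos(theta)) over [0, 2*pi]."""
    h = 2.0 * math.pi / n
    return h * sum(1.0 / (a + b * math.cos((k + 0.5) * h)) for k in range(n))

def residue_result(a, b):
    """Closed form 2*pi/sqrt(a^2 - b^2) obtained from the residue theorem; needs a > |b|."""
    return 2.0 * math.pi / math.sqrt(a * a - b * b)

# Illustrative values only.
a, b = 2.0, 1.0
print(trig_integral(a, b), residue_result(a, b))
```

The midpoint rule is used because, for a smooth periodic integrand taken over a full period, it converges very rapidly.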

Rational Functions 

Integrals of the form

$$\int_{-\infty}^{\infty} R(x)\,dx, \tag{9.7}$$

where $R(x)$ is a rational function of $x$ with the degree of the denominator
exceeding the degree of the numerator by two or more, may be evaluated
by integrating around a rectangle from $-A$ to $+A$, $A$ to $A + iB$, $A + iB$ to
$-A + iB$, and back down to $-A$. Because the integrand decreases at least
as fast as $1/|z|^2$ as $z$ becomes large, we see that if we let $A, B \to \infty$, the
contributions from the unwanted parts of the contour become negligible.
Thus

$$I = 2\pi i\left(\sum \text{Residues of poles in upper half-plane}\right). \tag{9.8}$$

We could also use a rectangle in the lower half-plane with the result

$$I = -2\pi i\left(\sum \text{Residues of poles in lower half-plane}\right). \tag{9.9}$$

This must give the same answer.

For example, let $n$ be a positive integer and consider

$$I = \int_{-\infty}^{\infty}\frac{dx}{(1 + x^2)^n}. \tag{9.10}$$

The integrand has $n$-th order poles at $z = \pm i$. Suppose we close the contour
in the upper half-plane. The new contour encloses the pole at $z = +i$ and
we therefore need to compute its residue. We set $z - i = \zeta$ and expand

$$\frac{1}{(1 + z^2)^n} = \frac{1}{[(i + \zeta)^2 + 1]^n} = \frac{1}{(2i\zeta)^n}\left(1 - \frac{i\zeta}{2}\right)^{-n}
= \frac{1}{(2i\zeta)^n}\left(1 + n\left(\frac{i\zeta}{2}\right) + \frac{n(n+1)}{2!}\left(\frac{i\zeta}{2}\right)^2 + \cdots\right). \tag{9.11}$$

The coefficient of $\zeta^{-1}$ is

$$\frac{1}{(2i)^n}\,\frac{n(n+1)\cdots(2n-2)}{(n-1)!}\left(\frac{i}{2}\right)^{n-1} = \frac{1}{2^{2n-1}\,i}\,\frac{(2n-2)!}{((n-1)!)^2}. \tag{9.12}$$

The integral is therefore

$$I = \frac{\pi}{2^{2n-2}}\,\frac{(2n-2)!}{((n-1)!)^2}. \tag{9.13}$$

These integrals can also be done by partial fractions.
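
The closed form (9.13) is easy to test numerically. The sketch below is an illustration under stated assumptions: the function names are hypothetical, and the substitution $x = \tan\varphi$ (which turns the integrand into $\cos^{2n-2}\varphi$ on $(-\pi/2, \pi/2)$) is our choice of quadrature trick, not a method from the text.

```python
import math

def numeric_integral(n, steps=20000):
    """Estimate the integral of dx/(1+x^2)^n over the real line.

    With x = tan(phi), dx/(1+x^2)^n becomes cos(phi)^(2n-2) dphi
    on (-pi/2, pi/2); the midpoint rule is applied to that form.
    """
    h = math.pi / steps
    return h * sum(
        math.cos(-math.pi / 2 + (k + 0.5) * h) ** (2 * n - 2)
        for k in range(steps)
    )

def residue_formula(n):
    """pi * (2n-2)! / (2^(2n-2) * ((n-1)!)^2), the result of the residue calculation."""
    return math.pi * math.factorial(2 * n - 2) / (
        2 ** (2 * n - 2) * math.factorial(n - 1) ** 2
    )

for n in range(1, 6):
    print(n, numeric_integral(n), residue_formula(n))
```

For $n = 1$ both expressions reduce to $\pi$, the familiar integral of $1/(1+x^2)$.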



9.1.2 Branch-cut integrals 

Integrals of the form

$$I = \int_0^{\infty} x^{a-1} R(x)\,dx, \tag{9.14}$$

where $R(x)$ is rational, can be evaluated by integration round a slotted circle
(or "key-hole") contour.

Figure 9.1: A slotted circle contour $\Gamma$ of outer radius $A$ and inner radius $\epsilon$.

A little more work is required to extract the answer, though.
For example, consider

$$I = \int_0^{\infty}\frac{x^{a-1}}{1 + x}\,dx, \qquad 0 < \mathrm{Re}\,a < 1. \tag{9.15}$$

The restrictions on the range of a are necessary for the integral to converge 
at its upper and lower limits. 

We take $\Gamma$ to be a circle of radius $A$ centred at $z = 0$, with a slot indentation
designed to exclude the positive real axis, which we take as the branch
cut of $z^{a-1}$, and a small circle of radius $\epsilon$ about the origin. The branch of
the fractional power is defined by setting

$$z^{a-1} = \exp[(a - 1)(\ln|z| + i\theta)], \tag{9.16}$$

where we will take $\theta$ to be zero immediately above the real axis, and $2\pi$
immediately below it. With this definition the residue at the pole at $z = -1$
is $e^{i\pi(a-1)}$. The residue theorem therefore tells us that

$$\oint_\Gamma\frac{z^{a-1}}{1 + z}\,dz = 2\pi i\,e^{\pi i(a-1)}. \tag{9.17}$$

The integral decomposes as

$$\oint_\Gamma\frac{z^{a-1}}{1 + z}\,dz = \oint_{|z|=A}\frac{z^{a-1}}{1 + z}\,dz + \left(1 - e^{2\pi i(a-1)}\right)\int_\epsilon^A\frac{x^{a-1}}{1 + x}\,dx - \oint_{|z|=\epsilon}\frac{z^{a-1}}{1 + z}\,dz. \tag{9.18}$$

As we send $A$ off to infinity we can ignore the "1" in the denominator compared
to the $z$, and so estimate

$$\left|\oint_{|z|=A}\frac{z^{a-1}}{1 + z}\,dz\right| \approx \left|\oint_{|z|=A} z^{a-2}\,dz\right| \le 2\pi A \times A^{\mathrm{Re}(a) - 2}. \tag{9.19}$$

This tends to zero provided that $\mathrm{Re}\,a < 1$. Similarly, provided $0 < \mathrm{Re}\,a$, the
integral around the small circle about the origin tends to zero with $\epsilon$. Thus

$$-e^{\pi i a}\,2\pi i = \left(1 - e^{2\pi i(a-1)}\right) I. \tag{9.20}$$

We conclude that

$$I = \frac{2\pi i}{\left(e^{\pi i a} - e^{-\pi i a}\right)} = \frac{\pi}{\sin\pi a}. \tag{9.21}$$
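
The result (9.21) can be checked numerically for real $a$ in $(0, 1)$. In this sketch the substitution $x = e^u$, the truncation `u_max`, and the function name are all illustrative assumptions; the substitution is chosen because it removes the endpoint singularity at $x = 0$ and gives an integrand that decays exponentially in both directions.

```python
import math

def keyhole_integral_numeric(a, u_max=100.0, steps=200000):
    """Trapezoid estimate of the integral of x^(a-1)/(1+x) over (0, infinity).

    With x = e^u the integral becomes that of e^(a u)/(1 + e^u)
    over the whole real u-axis, truncated here to [-u_max, u_max].
    """
    h = 2.0 * u_max / steps
    total = 0.0
    for k in range(steps + 1):
        u = -u_max + k * h
        weight = 0.5 if k in (0, steps) else 1.0
        total += weight * math.exp(a * u) / (1.0 + math.exp(u))
    return h * total

for a in (0.3, 0.5, 0.8):
    print(a, keyhole_integral_numeric(a), math.pi / math.sin(math.pi * a))
```

As $a$ approaches 0 or 1 the integrand decays more slowly at one end and `u_max` would need to grow, mirroring the convergence conditions on $\mathrm{Re}\,a$.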
Exercise 9.1: Using the slotted circle contour, show that

$$I = \int_0^{\infty}\frac{x^{p-1}}{1 + x^2}\,dx = \frac{\pi}{2\sin(\pi p/2)} = \frac{\pi}{2}\,\mathrm{cosec}\,(\pi p/2), \qquad 0 < p < 2.$$



Exercise 9.2: Integrate $z^{a-1}/(z - 1)$ around a contour $\Gamma_1$ consisting of a semicircle
in the upper half plane together with the real axis indented at $z = 0$
and $z = 1$

Figure 9.2: The contour $\Gamma_1$.

to get

$$0 = \oint_{\Gamma_1}\frac{z^{a-1}}{z - 1}\,dz = P\!\int_0^{\infty}\frac{x^{a-1}}{x - 1}\,dx - i\pi + (\cos\pi a + i\sin\pi a)\int_0^{\infty}\frac{x^{a-1}}{x + 1}\,dx.$$

As usual, the symbol $P$ in front of the integral sign denotes a principal part
integral, meaning that we must omit an infinitesimal segment of the contour
symmetrically disposed about the pole at $z = 1$. The term $-i\pi$ comes from
integrating around the small semicircle about this point. We get $-1/2$ of the
residue because we have only a half circle, and that traversed in the "wrong"
direction. Warning: this fractional residue result is only true when we indent
to avoid a simple pole, i.e. one that is of order one.

Now take real and imaginary parts and deduce that

$$\int_0^{\infty}\frac{x^{a-1}}{1 + x}\,dx = \frac{\pi}{\sin\pi a}, \qquad 0 < \mathrm{Re}\,a < 1,$$

and

$$P\!\int_0^{\infty}\frac{x^{a-1}}{1 - x}\,dx = \pi\cot\pi a, \qquad 0 < \mathrm{Re}\,a < 1.$$

9.1.3 Jordan's Lemma 

We often need to evaluate Fourier integrals

$$I(k) = \int_{-\infty}^{\infty} e^{ikx} R(x)\,dx \tag{9.22}$$

with $R(x)$ a rational function. For example, the Green function for the
operator $-\partial_x^2 + m^2$ is given by

$$G(x) = \int_{-\infty}^{\infty}\frac{dk}{2\pi}\,\frac{e^{ikx}}{k^2 + m^2}. \tag{9.23}$$

Suppose $x \in \mathbb{R}$ and $x > 0$. Then, in contrast to the analogous integral
without the exponential function, we have no flexibility in closing the contour
in the upper or lower half-plane. The function $e^{ikx}$ grows without limit as
we head south in the lower half-plane, but decays rapidly in the upper half-plane.
This means that we may close the contour without changing the value
of the integral by adding a large upper-half-plane semicircle.

Figure 9.3: Closing the contour in the upper half-plane.



The modified contour encloses a pole at $k = im$, where $e^{ikx}/(k^2 + m^2)$ has residue
$e^{-mx}/(2im)$. Thus

$$G(x) = \frac{1}{2m}e^{-mx}, \qquad x > 0. \tag{9.24}$$

For $x < 0$, the situation is reversed, and we must close in the lower half-plane.
The residue of the pole at $k = -im$ is $-e^{mx}/(2im)$, but the minus sign is
cancelled because the contour goes the "wrong way" (clockwise). Thus

$$G(x) = \frac{1}{2m}e^{+mx}, \qquad x < 0. \tag{9.25}$$

We can combine the two results as

$$G(x) = \frac{1}{2m}e^{-m|x|}. \tag{9.26}$$

The formal proof that the added semicircles make no contribution to the
integral when their radius becomes large is known as Jordan's Lemma:

Lemma: Let $\Gamma$ be a semicircle, centred at the origin, and of radius $R$. Suppose

i) that $f(z)$ is meromorphic in the upper half-plane;

ii) that $f(z)$ tends uniformly to zero as $|z| \to \infty$ for $0 < \arg z < \pi$;

iii) the number $\lambda$ is real and positive.

Then

$$\int_\Gamma e^{i\lambda z} f(z)\,dz \to 0, \qquad \text{as } R \to \infty. \tag{9.27}$$

To establish this, we assume that $R$ is large enough that $|f| < \epsilon$ on the
contour, and make a simple estimate

$$\left|\int_\Gamma e^{i\lambda z} f(z)\,dz\right| < 2R\epsilon\int_0^{\pi/2} e^{-\lambda R\sin\theta}\,d\theta
< 2R\epsilon\int_0^{\pi/2} e^{-2\lambda R\theta/\pi}\,d\theta
= \frac{\pi\epsilon}{\lambda}\left(1 - e^{-\lambda R}\right) < \frac{\pi\epsilon}{\lambda}. \tag{9.28}$$

In the second inequality we have used the fact that $(\sin\theta)/\theta \ge 2/\pi$ for angles
in the range $0 < \theta < \pi/2$. Since $\epsilon$ can be made as small as we like, the lemma
follows.

Example: Evaluate

$$I(\alpha) = \int_{-\infty}^{\infty}\frac{\sin(\alpha x)}{x}\,dx. \tag{9.29}$$

We have

$$I(\alpha) = \mathrm{Im}\left\{\int_{-\infty}^{\infty}\frac{\exp i\alpha z}{z}\,dz\right\}. \tag{9.30}$$

If we take $\alpha > 0$, we can close in the upper half-plane, but our contour must
exclude the pole at $z = 0$. Therefore

$$0 = \int_{|z|=R}\frac{\exp i\alpha z}{z}\,dz - \int_{|z|=\epsilon}\frac{\exp i\alpha z}{z}\,dz + \int_{-R}^{-\epsilon}\frac{\exp i\alpha x}{x}\,dx + \int_{\epsilon}^{R}\frac{\exp i\alpha x}{x}\,dx. \tag{9.31}$$

As $R \to \infty$, we can ignore the big semicircle. The rest, after letting $\epsilon \to 0$,
gives

$$0 = -i\pi + P\!\int_{-\infty}^{\infty}\frac{e^{i\alpha x}}{x}\,dx. \tag{9.32}$$

Again, the symbol $P$ denotes a principal part integral. The $-i\pi$ comes from
the small semicircle. We get $-1/2$ the residue because we have only a half
circle, and that traversed in the "wrong" direction. (Remember that this
fractional residue result is only true when we indent to avoid a simple pole,
i.e. one that is of order one.)

Reading off the real and imaginary parts, we conclude that

$$\int_{-\infty}^{\infty}\frac{\sin\alpha x}{x}\,dx = \pi, \qquad P\!\int_{-\infty}^{\infty}\frac{\cos\alpha x}{x}\,dx = 0, \qquad \alpha > 0. \tag{9.33}$$

No "P" is needed in the sine integral, as the integrand is finite at $x = 0$.

If we relax the condition that $\alpha > 0$ and take into account that sine is an
odd function of its argument, we have

$$\int_{-\infty}^{\infty}\frac{\sin\alpha x}{x}\,dx = \pi\,\mathrm{sgn}\,\alpha. \tag{9.34}$$

This identity is called Dirichlet's discontinuous integral.

We can interpret Dirichlet's integral as giving the Fourier transform of
the principal part distribution $P(1/x)$ as

$$P\!\int_{-\infty}^{\infty}\frac{e^{i\omega x}}{x}\,dx = i\pi\,\mathrm{sgn}\,\omega. \tag{9.35}$$

This will be of use later in the chapter.
Example:

Figure 9.4: Quadrant contour.

We will evaluate the integral

$$\oint_C e^{iz} z^{a-1}\,dz \tag{9.36}$$

about the first-quadrant contour shown in figure 9.4. Observe that when $0 < a < 1$
neither the large nor the small arc makes a contribution, and that there are
no poles. Hence, we deduce that

$$0 = \int_0^{\infty} e^{ix} x^{a-1}\,dx - i\int_0^{\infty} e^{-y} y^{a-1} e^{(a-1)\frac{\pi}{2}i}\,dy, \qquad 0 < a < 1. \tag{9.37}$$

Taking real and imaginary parts, we find

$$\int_0^{\infty} x^{a-1}\cos x\,dx = \Gamma(a)\cos\left(\frac{\pi}{2}a\right), \qquad 0 < a < 1,$$
$$\int_0^{\infty} x^{a-1}\sin x\,dx = \Gamma(a)\sin\left(\frac{\pi}{2}a\right), \qquad 0 < a < 1, \tag{9.38}$$

where

$$\Gamma(a) = \int_0^{\infty} y^{a-1} e^{-y}\,dy \tag{9.39}$$

is the Euler Gamma function.

Example: Fresnel integrals. Integrals of the form

$$C(t) = \int_0^t \cos(\pi x^2/2)\,dx, \tag{9.40}$$
$$S(t) = \int_0^t \sin(\pi x^2/2)\,dx, \tag{9.41}$$

occur in the theory of diffraction and are called Fresnel integrals after Augustin
Fresnel. They are naturally combined as

$$C(t) + iS(t) = \int_0^t e^{i\pi x^2/2}\,dx. \tag{9.42}$$

The limit as $t \to \infty$ exists and is finite. Even though the integrand does not
tend to zero at infinity, its rapid oscillation for large $x$ is just sufficient to
ensure convergence.¹

As $t$ varies, the complex function $C(t) + iS(t)$ traces out the Cornu Spiral,
named after Marie Alfred Cornu, a 19th century French optical physicist.

¹ We can exhibit this convergence by setting $x^2 = s$ and then integrating by parts to
get

$$\int_0^t e^{i\pi x^2/2}\,dx = \frac{1}{2}\int_0^1 e^{i\pi s/2}\frac{ds}{s^{1/2}} + \left[\frac{e^{i\pi s/2}}{\pi i\,s^{1/2}}\right]_1^{t^2} + \frac{1}{2\pi i}\int_1^{t^2} e^{i\pi s/2}\frac{ds}{s^{3/2}}.$$

The right hand side is now manifestly convergent as $t \to \infty$.

Figure 9.5: The Cornu spiral $C(t) + iS(t)$ for $t$ in the range $-8 < t < 8$. The
spiral in the first quadrant corresponds to positive values of $t$.

We can evaluate the limiting value

$$C(\infty) + iS(\infty) = \int_0^{\infty} e^{i\pi x^2/2}\,dx \tag{9.43}$$

by deforming the contour off the real axis and onto a line of length $L$ running
into the first quadrant at 45°, this being the direction of most rapid decrease
of the integrand.

Figure 9.6: Fresnel contour.

A circular arc returns the contour to the axis whence it continues to $\infty$, but
an estimate similar to that in Jordan's lemma shows that the arc and the
subsequent segment on the real axis make a negligible contribution when $L$
is large. To evaluate the integral on the radial line we set $z = e^{i\pi/4}s$, and so

$$\int_0^{e^{i\pi/4}\infty} e^{i\pi z^2/2}\,dz = e^{i\pi/4}\int_0^{\infty} e^{-\pi s^2/2}\,ds = \frac{1}{\sqrt{2}}e^{i\pi/4} = \frac{1}{2}(1 + i). \tag{9.44}$$

Figure 9.5 shows how $C(t) + iS(t)$ orbits the limiting point $0.5 + 0.5i$ and
slowly spirals in towards it. Taking real and imaginary parts we have

$$\int_0^{\infty}\cos\left(\frac{\pi x^2}{2}\right)dx = \int_0^{\infty}\sin\left(\frac{\pi x^2}{2}\right)dx = \frac{1}{2}. \tag{9.45}$$
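
The slow spiralling towards $\frac{1}{2}(1 + i)$ can be seen numerically. The sketch below uses Simpson's rule; the function name, the step count and the sample value $t = 8$ are illustrative assumptions. It also uses the standard leading large-$t$ behaviour $C(t) + iS(t) \approx \frac{1}{2}(1 + i) - \frac{i}{\pi t}e^{i\pi t^2/2}$, which is quoted here as a known asymptotic result rather than something derived in the text.

```python
import math

def fresnel(t, steps=40000):
    """Simpson's-rule estimate of (C(t), S(t)); steps must be even."""
    h = t / steps
    c = 0.0
    s = 0.0
    for k in range(steps + 1):
        x = k * h
        # Simpson weights 1, 4, 2, 4, ..., 2, 4, 1
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        phase = 0.5 * math.pi * x * x
        c += weight * math.cos(phase)
        s += weight * math.sin(phase)
    return c * h / 3, s * h / 3

C8, S8 = fresnel(8.0)
print(C8, S8)
# Leading asymptotic estimate at t = 8, where pi t^2 / 2 = 32 pi:
print(0.5, 0.5 - 1.0 / (8.0 * math.pi))
```

At $t = 8$ the point is still a distance of about $1/(8\pi)$ from the limit, which is why the Cornu spiral winds many times before settling.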



9.2 The Schwarz Reflection Principle 

Theorem (Schwarz): Let $f(z)$ be analytic in a domain $D$ where $\partial D$ includes
a segment of the real axis. Assume that $f(z)$ is real when $z$ is real. Then
there is a unique analytic continuation of $f$ into the region $\bar D$ (the mirror
image of $D$ in the real axis) given by

$$g(z) = \begin{cases} f(z), & z \in D, \\ \overline{f(\bar z)}, & z \in \bar D, \\ \text{either}, & z \in \mathbb{R}. \end{cases} \tag{9.46}$$

Figure 9.7: The domain $D$ and its mirror image $\bar D$.



The proof invokes Morera's theorem to show analyticity, and then appeals
to the uniqueness of analytic continuations. Begin by looking at a closed
contour lying only in $\bar D$:

$$\oint_{\bar C}\overline{f(\bar z)}\,dz, \tag{9.47}$$

where $\bar C = \{\bar\eta(t)\}$ is the image of $C = \{\eta(t)\} \subset D$ under reflection in the
real axis. We can rewrite this as

$$\oint_{\bar C}\overline{f(\bar z)}\,dz = \oint\overline{f(\eta)}\,\frac{d\bar\eta}{dt}\,dt = \overline{\oint_C f(\eta)\,\frac{d\eta}{dt}\,dt} = 0. \tag{9.48}$$

At the last step we have used Cauchy and the analyticity of $f$ in $D$. Morera's
theorem therefore confirms that $g(z)$ is analytic in $\bar D$. By breaking a general
contour up into parts in $D$ and parts in $\bar D$, we can similarly show that $g(z)$
is analytic in $D \cup \bar D$.

The important corollary is that if $f(z)$ is analytic, and real on some
segment of the real axis, but has a cut along some other part of the real axis,
then $f(x + i\epsilon) = \overline{f(x - i\epsilon)}$ as we go over the cut. The discontinuity $\mathrm{disc}\,f$ is
therefore $2i\,\mathrm{Im}\,f(x + i\epsilon)$.

Suppose $f(z)$ is real on the negative real axis, and goes to zero as $|z| \to \infty$,
then applying Cauchy to the contour $\Gamma$ depicted in figure 9.8

Figure 9.8: The contour $\Gamma$ for the dispersion relation.

we find

$$f(\zeta) = \frac{1}{\pi}\int_0^{\infty}\frac{\mathrm{Im}\,f(x + i\epsilon)}{x - \zeta}\,dx, \tag{9.49}$$

for $\zeta$ within the contour. This is an example of a dispersion relation. The
name comes from the prototypical application of this technology to optical
dispersion, i.e. the variation of the refractive index with frequency.

If $f(z)$ does not tend to zero at infinity then we cannot ignore the contribution
to Cauchy's formula from the large circle. We can, however, still
write

$$f(\zeta) = \frac{1}{2\pi i}\oint_\Gamma\frac{f(z)}{z - \zeta}\,dz, \tag{9.50}$$

and

$$f(b) = \frac{1}{2\pi i}\oint_\Gamma\frac{f(z)}{z - b}\,dz, \tag{9.51}$$

for some convenient point $b$ within the contour. We then subtract to get

$$f(\zeta) = f(b) + \frac{(\zeta - b)}{2\pi i}\oint_\Gamma\frac{f(z)}{(z - b)(z - \zeta)}\,dz. \tag{9.52}$$

Because of the extra power of $z$ downstairs in the integrand, we only need $f$
to be bounded at infinity for the contribution of the large circle to tend to
zero. If this is the case, we have

$$f(\zeta) = f(b) + \frac{(\zeta - b)}{\pi}\int_0^{\infty}\frac{\mathrm{Im}\,f(x + i\epsilon)}{(x - b)(x - \zeta)}\,dx. \tag{9.53}$$

This is called a once-subtracted dispersion relation.

The dispersion relations derived above apply when $\zeta$ lies within the contour.
In physics applications we often need $f(\zeta)$ for $\zeta$ real and positive. What
happens as $\zeta$ approaches the axis, and we attempt to divide by zero in such
an integral, is summarized by the Plemelj formulae: If $f(\zeta)$ is defined by

$$f(\zeta) = \frac{1}{\pi}\int_\Gamma\frac{\rho(z)}{z - \zeta}\,dz, \tag{9.54}$$

where $\Gamma$ has a segment lying on the real axis, then, if $x$ lies in this segment,

$$\frac{1}{2}\bigl(f(x + i\epsilon) - f(x - i\epsilon)\bigr) = i\rho(x),$$
$$\frac{1}{2}\bigl(f(x + i\epsilon) + f(x - i\epsilon)\bigr) = \frac{P}{\pi}\int_\Gamma\frac{\rho(x')}{x' - x}\,dx'. \tag{9.55}$$

As always, the "P" means that we are to delete an infinitesimal segment of
the contour lying symmetrically about the pole.

Figure 9.9: Origin of the Plemelj formulae.

The Plemelj formulae hold under relatively mild conditions on the function
$\rho(x)$. We won't try to give a general proof, but in the case that $\rho$ is analytic
the result is easy to understand: we can push the contour out of the way
and let $\zeta \to x$ on the real axis from either above or below. In that case
figure 9.9 shows how the sum of these two limits gives the
principal-part integral and how their difference gives an integral round a
small circle, and hence the residue $\rho(x)$.

The Plemelj equations usually appear in physics papers as the "$i\epsilon$" cabala

$$\frac{1}{x' - x \pm i\epsilon} = P\left(\frac{1}{x' - x}\right) \mp i\pi\delta(x' - x). \tag{9.56}$$

A limit $\epsilon \to 0$ is always to be understood in this formula.

Figure 9.10: Sketch of the real and imaginary parts of $f(x') = 1/(x' - x - i\epsilon)$.

We can also appreciate the origin of the $i\epsilon$ rule by examining the following
identity:

$$\frac{1}{x' - (x \pm i\epsilon)} = \frac{x' - x}{(x' - x)^2 + \epsilon^2} \pm \frac{i\epsilon}{(x' - x)^2 + \epsilon^2}. \tag{9.57}$$

The first term is a symmetrically cut-off version of $1/(x' - x)$ and provides
the principal-part integral. The second term sharpens and tends to the delta
function $\pm i\pi\delta(x' - x)$ as $\epsilon \to 0$.
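
The claim that $\epsilon/((x' - x)^2 + \epsilon^2)$ tends to $\pi\delta(x' - x)$ can be illustrated by smearing it against a smooth test function. In the sketch below the Gaussian test function, the function name and the values of $\epsilon$ are assumptions chosen for illustration; the substitution $x' = x + \epsilon\tan\theta$ is used because it maps the Lorentzian weight exactly to $d\theta$ on $(-\pi/2, \pi/2)$.

```python
import math

def lorentzian_smear(phi, x, eps, steps=200000):
    """Estimate the integral of phi(x') * eps / ((x' - x)^2 + eps^2) over the real line.

    With x' = x + eps * tan(theta) the Lorentzian times dx' is exactly dtheta,
    so the integral becomes that of phi(x + eps * tan(theta)) over (-pi/2, pi/2).
    """
    h = math.pi / steps
    total = 0.0
    for k in range(steps):
        theta = -math.pi / 2 + (k + 0.5) * h
        total += phi(x + eps * math.tan(theta))
    return h * total

def gaussian(x):
    """Illustrative smooth test function."""
    return math.exp(-x * x)

x0 = 0.5
for eps in (1e-1, 1e-2, 1e-3):
    print(eps, lorentzian_smear(gaussian, x0, eps), math.pi * gaussian(x0))
```

As $\epsilon$ decreases the smeared value approaches $\pi\varphi(x)$, with an error that shrinks roughly in proportion to $\epsilon$.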

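Exercise 9.3: The Legendre function of the second kind $Q_n(z)$ may be defined
for positive integer $n$ by the integral

$$Q_n(z) = \frac{1}{2^{n+1}}\int_{-1}^{1}\frac{(1 - t^2)^n}{(z - t)^{n+1}}\,dt.$$

Show that for $x \in [-1, 1]$ we have

$$Q_n(x + i\epsilon) - Q_n(x - i\epsilon) = -i\pi P_n(x),$$

where $P_n(x)$ is the Legendre Polynomial. Deduce Neumann's formula

$$Q_n(z) = \frac{1}{2}\int_{-1}^{1}\frac{P_n(t)}{z - t}\,dt, \qquad z \notin [-1, 1].$$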

9.2.1 Kramers-Kronig Relations 

Causality is the usual source of analyticity in physical applications. If $G(t)$
is a response function

$$\phi_{\rm response}(t) = \int_{-\infty}^{\infty} G(t - t')\,f_{\rm cause}(t')\,dt' \tag{9.58}$$

then for no effect to anticipate its cause we must have $G(t) = 0$ for $t < 0$.
The Fourier transform

$$G(\omega) = \int_{-\infty}^{\infty} e^{i\omega t} G(t)\,dt, \tag{9.59}$$

is then automatically analytic everywhere in the upper half plane. Suppose,
for example, we look at a forced, damped, harmonic oscillator whose displacement
$x(t)$ obeys

$$\ddot x + 2\gamma\dot x + (\Omega^2 + \gamma^2)x = F(t), \tag{9.60}$$

where the friction coefficient $\gamma$ is positive. As we saw earlier, the solution is
of the form

$$x(t) = \int_{-\infty}^{\infty} G(t, t')F(t')\,dt',$$

where the Green function $G(t, t') = 0$ if $t < t'$. In this case

$$G(t, t') = \begin{cases} \Omega^{-1}e^{-\gamma(t - t')}\sin\Omega(t - t'), & t > t', \\ 0, & t < t', \end{cases} \tag{9.61}$$

and so

$$x(t) = \frac{1}{\Omega}\int_{-\infty}^{t} e^{-\gamma(t - t')}\sin\Omega(t - t')\,F(t')\,dt'. \tag{9.62}$$

Because the integral extends only from $0$ to $+\infty$, the Fourier transform of
$G(t, 0)$,

$$G(\omega) = \frac{1}{\Omega}\int_0^{\infty} e^{i\omega t}e^{-\gamma t}\sin\Omega t\,dt, \tag{9.63}$$

is nicely convergent when $\mathrm{Im}\,\omega > 0$, as evidenced by

$$G(\omega) = \frac{1}{\Omega^2 - (\omega + i\gamma)^2} \tag{9.64}$$

having no singularities in the upper half-plane.²
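
The closed form (9.64) can be compared with a direct numerical evaluation of the transform (9.63). The sketch below is illustrative only: the function names, the truncation time `t_max` and the sample parameters $\gamma = 0.5$, $\Omega = 2$ are assumptions, not values from the text.

```python
import cmath
import math

def response_numeric(omega, gamma, Omega, t_max=80.0, steps=160000):
    """Simpson estimate of (1/Omega) * integral_0^inf e^{i omega t} e^{-gamma t} sin(Omega t) dt.

    The integral is truncated at t_max, which is harmless once
    exp(-(gamma + Im omega) * t_max) is negligible. steps must be even.
    """
    h = t_max / steps
    total = 0j
    for k in range(steps + 1):
        t = k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * cmath.exp((1j * omega - gamma) * t) * math.sin(Omega * t)
    return total * h / (3 * Omega)

def response_closed(omega, gamma, Omega):
    """1 / (Omega^2 - (omega + i gamma)^2), with poles at omega = +/-Omega - i gamma."""
    return 1 / (Omega ** 2 - (omega + 1j * gamma) ** 2)

gamma, Omega = 0.5, 2.0
for omega in (1.3, 1.3 + 0.7j):
    print(omega, response_numeric(omega, gamma, Omega), response_closed(omega, gamma, Omega))
```

Both poles of the closed form sit at $\omega = \pm\Omega - i\gamma$, below the real axis for positive $\gamma$, which is the analyticity statement made above.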

Another example of such a causal function is provided by the complex,
frequency-dependent, refractive index of a material $n(\omega)$. This is defined so
that a travelling wave takes the form

$$\varphi(\mathbf{x}, t) = e^{i n(\omega)\mathbf{k}\cdot\mathbf{x} - i\omega t}. \tag{9.65}$$

We can decompose $n$ into its real and imaginary parts

$$n(\omega) = n_R(\omega) + i n_I(\omega) = n_R(\omega) + \frac{i}{2|\mathbf{k}|}\gamma(\omega), \tag{9.66}$$

where $\gamma$ is the extinction coefficient, defined so that the intensity falls off
as $I \propto \exp(-\gamma\,\hat{\mathbf{n}}\cdot\mathbf{x})$, where $\hat{\mathbf{n}} = \mathbf{k}/|\mathbf{k}|$ is the direction of propagation. A
non-zero $\gamma$ can arise from either energy absorption or scattering out of the
forward direction.

² If a pole in a response function manages to sneak into the upper half plane, then
the system will be unstable to exponentially growing oscillations. This may happen, for
example, when we design an electronic circuit containing a feedback loop. Such poles, and
the resultant instabilities, can be detected by applying the principle of the argument from
the last chapter. This method leads to the Nyquist stability criterion.

Being a causal response, the refractive index extends to a function analytic
in the upper half plane and $n(\omega)$ for real $\omega$ is the boundary value

$$n(\omega)_{\rm physical} = \lim_{\epsilon \to 0} n(\omega + i\epsilon) \tag{9.67}$$

of this analytic function. Because a real ($E = E^*$) incident wave must give
rise to a real wave in the material, and because the wave must decay in the
direction in which it is propagating, we have the reality conditions

$$\gamma(-\omega + i\epsilon) = -\gamma(\omega + i\epsilon),$$
$$n_R(-\omega + i\epsilon) = +n_R(\omega + i\epsilon) \tag{9.68}$$

with $\gamma$ positive for positive frequency.

Many materials have a frequency range $|\omega| < |\omega_{\min}|$ where $\gamma = 0$, so
the material is transparent. For any such material $n(\omega)$ obeys the Schwarz
reflection principle and so there is an analytic continuation into the lower
half-plane. At frequencies $\omega$ where the material is not perfectly transparent,
the refractive index has an imaginary part even when $\omega$ is real. By Schwarz, $n$
must be discontinuous across the real axis at these frequencies: $n(\omega + i\epsilon) =
n_R + i n_I \ne n(\omega - i\epsilon) = n_R - i n_I$. These discontinuities of $2i n_I$ usually
correspond to branch cuts.

No substance is able to respond to infinitely high frequency disturbances,
so $n \to 1$ as $|\omega| \to \infty$, and we can apply our dispersion relation technology
to the function $n - 1$. We will need the contour shown in figure 9.11, which has cuts
for both positive and negative frequencies.

Figure 9.11: Contour for the $n - 1$ dispersion relation.
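
By applying the dispersion-relation strategy, we find

$$n(\omega) = 1 + \frac{1}{\pi}\int_{\omega_{\min}}^{\infty}\frac{n_I(\omega')}{\omega' - \omega}\,d\omega' + \frac{1}{\pi}\int_{-\infty}^{-\omega_{\min}}\frac{n_I(\omega')}{\omega' - \omega}\,d\omega' \tag{9.69}$$

for $\omega$ within the contour. Using Plemelj we can now take $\omega$ onto the real axis
to get

$$n_R(\omega) = 1 + \frac{P}{\pi}\int_{\omega_{\min}}^{\infty}\frac{n_I(\omega')}{\omega' - \omega}\,d\omega' + \frac{P}{\pi}\int_{-\infty}^{-\omega_{\min}}\frac{n_I(\omega')}{\omega' - \omega}\,d\omega'$$
$$= 1 + \frac{P}{\pi}\int_{\omega_{\min}^2}^{\infty}\frac{n_I(\omega')}{\omega'^2 - \omega^2}\,d\omega'^2$$
$$= 1 + \frac{c}{\pi}P\!\int_{\omega_{\min}}^{\infty}\frac{\gamma(\omega')}{\omega'^2 - \omega^2}\,d\omega'. \tag{9.70}$$

In the second line we have used the anti-symmetry of $n_I(\omega)$ to combine the
positive and negative frequency range integrals. In the last line we have used
the relation $\omega/k = c$ to make connection with the way this equation is written
in R. G. Newton's authoritative Scattering Theory of Waves and Particles.
This relation, between the real and absorptive parts of the refractive index,
is called a Kramers-Kronig dispersion relation, after the original authors.³

If $n \to 1$ fast enough that $\omega^2(n - 1) \to 0$ as $|\omega| \to \infty$, we can take the $f$
in the dispersion relation to be $\omega^2(n - 1)$ and deduce that

$$n_R(\omega) = 1 + \frac{c}{\pi}P\!\int_{\omega_{\min}}^{\infty}\left(\frac{\omega'^2}{\omega^2}\right)\frac{\gamma(\omega')}{\omega'^2 - \omega^2}\,d\omega', \tag{9.71}$$

another popular form of Kramers-Kronig. This second relation implies the
first, but not vice-versa, because the second demands more restrictive behavior
for $n(\omega)$.

Similar equations can be derived for other causal functions. A quantity
closely related to the refractive index is the frequency-dependent dielectric
"constant"

$$\epsilon(\omega) = \epsilon_1 + i\epsilon_2. \tag{9.72}$$

Again $\epsilon \to 1$ as $|\omega| \to \infty$, and, proceeding as before, we deduce that

$$\epsilon_1(\omega) = 1 + \frac{P}{\pi}\int_{\omega_{\min}^2}^{\infty}\frac{\epsilon_2(\omega')}{\omega'^2 - \omega^2}\,d\omega'^2. \tag{9.73}$$

³ H. A. Kramers, Nature, 117 (1926) 775; R. de L. Kronig, J. Opt. Soc. Am. 12 (1926)
547.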

9.2.2 Hilbert transforms

Suppose that $f(x)$ is the boundary value on the real axis of a function everywhere
analytic in the upper half-plane, and suppose further that $f(z) \to 0$
as $|z| \to \infty$ there. Then we have

$$f(z) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{f(x)}{x - z}\,dx \tag{9.74}$$

for $z$ in the upper half-plane. This is because we may close the contour with an
upper semicircle without changing the value of the integral. For the same
reason the integral must give zero when $z$ is taken in the lower half-plane.
Using the Plemelj formulae we deduce that on the real axis,

$$f(x) = \frac{P}{\pi i}\int_{-\infty}^{\infty}\frac{f(x')}{x' - x}\,dx'. \tag{9.75}$$

We can use this strategy to derive the Kramers-Kronig relations even if $n_I$
never vanishes, and so we cannot use the Schwarz reflection principle.

The relation (9.75) suggests the definition of the Hilbert transform, $\mathcal{H}\psi$,
of a function $\psi(x)$, as

$$(\mathcal{H}\psi)(x) = \frac{P}{\pi}\int_{-\infty}^{\infty}\frac{\psi(x')}{x - x'}\,dx'. \tag{9.76}$$

Note the interchange of $x$, $x'$ in the denominator of (9.76) when compared
with (9.75). This switch is to make the Hilbert transform into a convolution
integral. Equation (9.75) shows that a function that is the boundary value of
a function analytic and tending to zero at infinity in the upper half-plane is
automatically an eigenvector of $\mathcal{H}$ with eigenvalue $-i$. Similarly a function
that is the boundary value of a function analytic and tending to zero at
infinity in the lower half-plane will be an eigenvector with eigenvalue $+i$. (A
function analytic in the entire complex plane and tending to zero at infinity
must vanish identically by Liouville's theorem.)

Returning now to our original $f$, which had eigenvalue $-i$, and decomposing
it as $f(x) = f_R(x) + i f_I(x)$ we find that (9.75) becomes

$$f_I(x) = (\mathcal{H}f_R)(x),$$
$$f_R(x) = -(\mathcal{H}f_I)(x). \tag{9.77}$$
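
The pair of relations (9.77) can be checked on a simple example. The function $f(x) = i/(x + i)$ is analytic in the upper half-plane and tends to zero there, with $f_R = 1/(1 + x^2)$ and $f_I = x/(1 + x^2)$. The sketch below evaluates the principal-part integral numerically; the folding of the integral onto $s > 0$, the map $s = u/(1 - u)$ and the function names are illustrative choices of ours, not the text's.

```python
import math

def hilbert_transform(psi, x, steps=20000):
    """(H psi)(x) = (1/pi) P-integral of psi(x')/(x - x') dx'.

    Setting x' = x - s and pairing s with -s gives the ordinary integral
    of [psi(x - s) - psi(x + s)]/s over s in (0, infinity), which is then
    mapped to u in (0, 1) by s = u/(1 - u) and done by the midpoint rule.
    """
    h = 1.0 / steps
    total = 0.0
    for k in range(steps):
        u = (k + 0.5) * h
        s = u / (1.0 - u)
        jacobian = 1.0 / (1.0 - u) ** 2
        total += (psi(x - s) - psi(x + s)) / s * jacobian
    return h * total / math.pi

def f_real(x):
    return 1.0 / (1.0 + x * x)

def f_imag(x):
    return x / (1.0 + x * x)

x0 = 0.7
print(hilbert_transform(f_real, x0), f_imag(x0))    # f_I = H f_R
print(hilbert_transform(f_imag, x0), -f_real(x0))   # H f_I = -f_R
```

Pairing $s$ with $-s$ is one way of implementing the symmetric deletion that defines the principal part.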

Conversely, if we are given a real function $u(x)$ and set $v(x) = (\mathcal{H}u)(x)$,
then, under some mild restrictions on $u$ (that it lie in some $L^p(\mathbb{R})$, $p > 1$, for
example, in which case $v(x)$ is also in $L^p(\mathbb{R})$) the function

$$f(z) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{u(x) + iv(x)}{x - z}\,dx \tag{9.78}$$

will be analytic in the upper half plane, tend to zero at infinity there, and
have $u(x) + iv(x)$ as its boundary value as $z$ approaches the real axis from
above. The last line of (9.77) therefore shows that we may recover $u(x)$ from
$v(x)$ as $u(x) = -(\mathcal{H}v)(x)$. The Hilbert transform $\mathcal{H} : L^p(\mathbb{R}) \to L^p(\mathbb{R})$ is
therefore invertible, and its inverse is given by $\mathcal{H}^{-1} = -\mathcal{H}$. (Note that the
Hilbert transform of a constant is zero, but the $L^p(\mathbb{R})$ condition excludes
constants from the domain of $\mathcal{H}$, and so this fact does not conflict with
invertibility.)

Hilbert transforms are useful in signal processing. Given a real signal
$X_R(t)$ we can take its Hilbert transform so as to find the corresponding
imaginary part, $X_I(t)$, which serves to make the sum

$$Z(t) = X_R(t) + iX_I(t) = A(t)e^{i\phi(t)} \tag{9.79}$$

analytic in the upper half-plane. This complex function is the analytic signal.⁴
The real quantity $A(t)$ is then known as the instantaneous amplitude,
or envelope, while $\phi(t)$ is the instantaneous phase and

$$\omega_{\rm IF}(t) = \dot\phi(t) \tag{9.80}$$

is called the instantaneous frequency (IF). These quantities are used, for
example, in narrow band FM radio, in NMR, in geophysics, and in image
processing.
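
A discrete analogue of the analytic signal can be sketched with a plain discrete Fourier transform. This is an illustration under stated assumptions rather than the construction used in the text: the function name, the sample length 64 and the test tone are hypothetical, and a direct $O(N^2)$ DFT is used so that only the standard library is needed. The inverse transform here carries time dependence $e^{+2\pi i k m/N}$, so keeping the components with $0 < k < N/2$ corresponds to keeping what the text writes as $e^{-i\omega t}$ with $\omega < 0$.

```python
import cmath
import math

def analytic_signal(x):
    """Discrete analytic signal of a real sequence x of even length.

    Forward DFT, then keep k = 0 and k = N/2 unchanged, double the
    components with 0 < k < N/2, zero the rest, and transform back.
    """
    n = len(x)
    spectrum = [
        sum(x[m] * cmath.exp(-2j * math.pi * k * m / n) for m in range(n))
        for k in range(n)
    ]
    for k in range(n):
        if 0 < k < n // 2:
            spectrum[k] *= 2
        elif k > n // 2:
            spectrum[k] = 0
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * m / n) for k in range(n)) / n
        for m in range(n)
    ]

N, k0 = 64, 5
signal = [math.cos(2 * math.pi * k0 * m / N) for m in range(N)]
z = analytic_signal(signal)
print(abs(z[3]), cmath.phase(z[1]) - cmath.phase(z[0]))
```

For the pure cosine the output is $e^{2\pi i k_0 m/N}$: the envelope $A$ is constant and the phase advances by $2\pi k_0/N$ per sample, which is the discrete version of a constant instantaneous frequency.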

Exercise 9.4: Let $\tilde f(\omega) = \int_{-\infty}^{\infty} e^{i\omega t} f(t)\,dt$ denote the Fourier transform of $f(t)$.
Use the formula (9.35) for the Fourier transform of $P(1/t)$, combined with the
convolution theorem for Fourier transforms, to show that the Fourier transform
of the Hilbert transform of $f(t)$ is

$$(\widetilde{\mathcal{H}f})(\omega) = i\,\mathrm{sgn}(\omega)\,\tilde f(\omega).$$

Deduce that the analytic signal is derived from the original real signal by
suppressing all positive frequency components (those proportional to $e^{-i\omega t}$
with $\omega > 0$) and multiplying the remaining negative-frequency amplitudes by
two.

⁴ D. Gabor, J. Inst. Elec. Eng. (Part 3), 93 (1946) 429-457.
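
Exercise 9.5: Suppose that $\varphi_1(x)$ and $\varphi_2(x)$ are real functions with finite
$L^2(\mathbb{R})$ norms.

a) Use the Fourier transform result from the previous exercise to show that

$$\langle\varphi_1, \varphi_2\rangle = \langle\mathcal{H}\varphi_1, \mathcal{H}\varphi_2\rangle.$$

Thus, $\mathcal{H}$ is a unitary transformation from $L^2(\mathbb{R}) \to L^2(\mathbb{R})$.

b) Use the fact that $\mathcal{H}^2 = -I$ to deduce that

$$\langle\mathcal{H}\varphi_1, \varphi_2\rangle = -\langle\varphi_1, \mathcal{H}\varphi_2\rangle$$

and so $\mathcal{H}^\dagger = -\mathcal{H}$.

c) Conclude from part b) that

$$\int_{-\infty}^{\infty}\varphi_1(x)\left(P\!\int_{-\infty}^{\infty}\frac{\varphi_2(y)}{x - y}\,dy\right)dx = \int_{-\infty}^{\infty}\varphi_2(y)\left(P\!\int_{-\infty}^{\infty}\frac{\varphi_1(x)}{x - y}\,dx\right)dy,$$

i.e., for $L^2(\mathbb{R})$ functions, it is legitimate to interchange the order of "P"
integration with ordinary integration.

d) By replacing $\varphi_1(x)$ by a constant, and $\varphi_2(x)$ by the Hilbert transform
of a function $f$ with $\int f\,dx \ne 0$, show that it is not always safe to
interchange the order of "P" integration with ordinary integration.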

Exercise 9.6: Suppose that we are given real functions $u_1(x)$ and $u_2(x)$ and substitute
their Hilbert transforms $v_1 = \mathcal{H}u_1$, $v_2 = \mathcal{H}u_2$ into (9.78) to construct
analytic functions $f_1(z)$ and $f_2(z)$. Then the product $f_1(z)f_2(z) = F(z)$ has
boundary value

$$F_R(x) + iF_I(x) = (u_1u_2 - v_1v_2) + i(u_1v_2 + u_2v_1).$$

By assuming that $F(z)$ satisfies the conditions for (9.77) to be applicable to
this boundary value, deduce that

$$\mathcal{H}\bigl((\mathcal{H}u_1)u_2\bigr) + \mathcal{H}\bigl((\mathcal{H}u_2)u_1\bigr) - (\mathcal{H}u_1)(\mathcal{H}u_2) = -u_1u_2. \qquad (\star)$$

This result⁵ sometimes appears in the physics literature⁶ in the guise of the
distributional identity

$$\frac{P}{x - y}\frac{P}{y - z} + \frac{P}{y - z}\frac{P}{z - x} + \frac{P}{z - x}\frac{P}{x - y} = -\pi^2\delta(x - y)\delta(x - z), \qquad (\star\star)$$

where $P/(x - y)$ denotes the principal-part distribution $P(1/(x - y))$. This
attractively symmetric form conceals the fact that a specific order of integration
is to be understood. As the next exercise shows, were we to freely
re-arrange the integration order we could use the identity

$$\frac{1}{x - y}\frac{1}{y - z} + \frac{1}{y - z}\frac{1}{z - x} + \frac{1}{z - x}\frac{1}{x - y} = 0$$

to wrongly conclude that the right-hand side is zero.

⁵ F. G. Tricomi, Quart. J. Math. (Oxford), (2) 2, (1951) 199.

⁶ For example, in R. Jackiw, A. Strominger, Phys. Lett. 99B (1981) 133.

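Exercise 9.7: Show that the identity $\star$ from exercise 9.6 can be written as

$$\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}\frac{\varphi_1(y)\varphi_2(z)}{(z - y)(y - x)}\,dz\right)dy = \int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}\frac{\varphi_1(y)\varphi_2(z)}{(z - y)(y - x)}\,dy\right)dz - \pi^2\varphi_1(x)\varphi_2(x),$$

principal-part integrals being understood where necessary. This is a special
case of a more general change-of-integration-order formula

$$\int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}\frac{f(x, y, z)}{(z - y)(y - x)}\,dz\right)dy = \int_{-\infty}^{\infty}\left(\int_{-\infty}^{\infty}\frac{f(x, y, z)}{(z - y)(y - x)}\,dy\right)dz - \pi^2 f(x, x, x),$$

which is due to G. H. Hardy (1908). Show that Hardy's formula is equivalent
to the distributional identity $\star\star$.

Exercise 9.8: Use the licit interchange of "P" integration with ordinary integration
to show that

$$\int_{-\infty}^{\infty}\varphi(x)\left(P\!\int_{-\infty}^{\infty}\frac{\varphi(y)}{x - y}\,dy\right)^2 dx = \frac{\pi^2}{3}\int_{-\infty}^{\infty}\varphi^3(x)\,dx.$$

Exercise 9.9: Let $f(z)$ be analytic within the unit circle, and let $u(\theta)$ and
$v(\theta)$ be the boundary values of its real and imaginary parts, respectively, at
$z = e^{i\theta}$. Use Plemelj to show that

$$u(\theta) = -\frac{1}{2\pi}P\!\int_0^{2\pi} v(\theta')\cot\left(\frac{\theta - \theta'}{2}\right)d\theta' + \frac{1}{2\pi}\int_0^{2\pi} u(\theta')\,d\theta',$$
$$v(\theta) = \frac{1}{2\pi}P\!\int_0^{2\pi} u(\theta')\cot\left(\frac{\theta - \theta'}{2}\right)d\theta' + \frac{1}{2\pi}\int_0^{2\pi} v(\theta')\,d\theta'.$$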



9.3 Partial-Fraction and Product Expansions

In this section we will study other useful representations of functions which
devolve from their analyticity properties.

9.3.1 Mittag-Leffler Partial-Fraction Expansion

Let $f(z)$ be a meromorphic function with poles (perhaps infinitely many)
at $z = z_j$, $(j = 1, 2, 3, \ldots)$, where $|z_1| < |z_2| < \ldots$. Let $\Gamma_n$ be a contour
enclosing the first $n$ poles. Suppose further (for ease of description) that the
poles are simple and have residues $r_j$. Then, for $z$ inside $\Gamma_n$, we have

$$\frac{1}{2\pi i}\oint_{\Gamma_n}\frac{f(z')}{z' - z}\,dz' = f(z) + \sum_{j=1}^{n}\frac{r_j}{z_j - z}. \tag{9.81}$$

We often want to apply this formula to trigonometric functions whose
periodicity means that they do not tend to zero at infinity. We therefore
employ the same subtraction strategy that we used for dispersion relations.
We subtract $f(0)$ from $f(z)$ to find

$$f(z) - f(0) = \frac{z}{2\pi i}\oint_{\Gamma_n}\frac{f(z')}{z'(z' - z)}\,dz' + \sum_{j=1}^{n} r_j\left(\frac{1}{z - z_j} + \frac{1}{z_j}\right). \tag{9.82}$$

If we now assume that $f(z)$ is uniformly bounded on the $\Gamma_n$ (this meaning
that $|f(z)| < A$ on $\Gamma_n$, with the same constant $A$ working for all $n$) then
the integral tends to zero as $n$ becomes large, yielding the partial fraction,
or Mittag-Leffler, decomposition

$$f(z) = f(0) + \sum_{j=1}^{\infty} r_j\left(\frac{1}{z - z_j} + \frac{1}{z_j}\right). \tag{9.83}$$
Example 1): Look at $\mathrm{cosec}\,z$. The residues of $1/(\sin z)$ at its poles at $z = n\pi$
are $r_n = (-1)^n$. We can take the $\Gamma_n$ to be squares with corners $(n + 1/2)(\pm 1 \pm i)\pi$.
A bit of effort shows that cosec is uniformly bounded on them. To use
the formula as given, we first need to subtract the pole at $z = 0$; then

$$\mathrm{cosec}\,z - \frac{1}{z} = {\sum_{n=-\infty}^{\infty}}' (-1)^n\left(\frac{1}{z - n\pi} + \frac{1}{n\pi}\right). \tag{9.84}$$

The prime on the summation symbol indicates that we are to omit the $n = 0$
term. The positive and negative $n$ series converge separately, so we can add
them, and write the more compact expression

$$\mathrm{cosec}\,z = \frac{1}{z} + 2z\sum_{n=1}^{\infty}(-1)^n\frac{1}{z^2 - n^2\pi^2}. \tag{9.85}$$
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Example 2): A similar method gives 



1 1 A 

- ? 

We can pair terms together to writen this as 



1 ( 1 1 \ 

cot z = - + V + — . (9.86) 

z ^ \z — nn nn J 



cot z 



n=l 

oo 



1 



2; — nn z + nn 



z* — Tl^TT 
n=l 

or 

N 



= - + E 2 ^2 2 ( 9 - 87 ) 



lim V — - — . (9.88) 



cot z 

Af->oo ' z — nn 

n=-N 

In the last formula it is important that the upper and lower limits of summa- 
tion be the same. Neither the sum over positive n nor the sum over negative 
n converges separately. By taking asymmetric upper and lower limits we 
could therefore obtain any desired number as the limit of the sum. 
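
The sensitivity of (9.88) to the way the limits are taken can be seen numerically. The following Python sketch is only an illustration: the function name and the test point z = 0.7 are our own choices, not part of the text. It compares the symmetric partial sum with one whose upper limit is 2N. The extra terms behave like $-(1/\pi)\sum_{n=N+1}^{2N} 1/n$, so the lopsided sum should miss cot z by roughly $-(\ln 2)/\pi$.

```python
import math

def cot_partial_sum(z, lower, upper):
    """Partial sum of 1/(z - n*pi) for n = lower, ..., upper."""
    return sum(1.0 / (z - n * math.pi) for n in range(lower, upper + 1))

z = 0.7            # illustrative test point, not a multiple of pi
N = 20000
exact = 1.0 / math.tan(z)
symmetric = cot_partial_sum(z, -N, N)        # equal limits, as in (9.88)
lopsided = cot_partial_sum(z, -N, 2 * N)     # unequal limits

print(f"cot z            = {exact:.6f}")
print(f"symmetric sum    = {symmetric:.6f}")
print(f"lopsided sum     = {lopsided:.6f}")
print(f"lopsided - cot z = {lopsided - exact:.6f}")
print(f"-ln(2)/pi        = {-math.log(2) / math.pi:.6f}")
```

Replacing the upper limit 2N by cN shifts the limit by $-(\ln c)/\pi$, which is how any desired value can be reached.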

Exercise 9.10: Use Mittag-Leffler to show that

$$\operatorname{cosec}^2 z = \sum_{n=-\infty}^{\infty}\frac{1}{(z+n\pi)^2}.$$

Now use this infinite series to give a one-line proof of the trigonometric identity

$$\sum_{m=0}^{N-1}\operatorname{cosec}^2\left(z+\frac{m\pi}{N}\right) = N^2\operatorname{cosec}^2(Nz).$$

(Is there a comparably easy elementary derivation of this finite sum?) Take a
limit to conclude that

$$\sum_{m=1}^{N-1}\operatorname{cosec}^2\left(\frac{m\pi}{N}\right) = \frac{1}{3}(N^2-1).$$

Exercise 9.11: From the partial fraction expansion for cot $z$, deduce that

$$\frac{d}{dz}\ln[(\sin z)/z] = \frac{d}{dz}\sum_{n=1}^{\infty}\ln(z^2 - n^2\pi^2).$$

Integrate this along a suitable path from $z = 0$, and so conclude that

$$\sin z = z\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2\pi^2}\right).$$



Exercise 9.12: By differentiating the partial fraction expansion for cot $z$, show
that, for $k$ an integer $\ge 1$, and $\operatorname{Im} z > 0$, we have

$$\sum_{n=-\infty}^{\infty}\frac{1}{(z+n)^{k+1}} = \frac{(-2\pi i)^{k+1}}{k!}\sum_{n=1}^{\infty}n^k e^{2\pi i n z}.$$

This is called Lipschitz's formula.

Exercise 9.13: The Bernoulli numbers are defined by

$$\frac{x}{e^x-1} = 1 + B_1 x + \sum_{k=1}^{\infty}B_{2k}\frac{x^{2k}}{(2k)!}.$$

The first few are $B_1 = -1/2$, $B_2 = 1/6$, $B_4 = -1/30$. Except for $B_1$, the $B_n$
are zero for $n$ odd. Show that

$$x\cot x = ix + \frac{2ix}{e^{2ix}-1} = 1 - \sum_{k=1}^{\infty}(-1)^{k+1}B_{2k}\frac{2^{2k}x^{2k}}{(2k)!}.$$

By expanding $1/(x^2 - n^2\pi^2)$ as a power series in $x$ and comparing coefficients,
deduce that, for positive integer $k$,

$$\sum_{n=1}^{\infty}\frac{1}{n^{2k}} = (-1)^{k+1}\pi^{2k}\frac{2^{2k-1}}{(2k)!}B_{2k}.$$

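Exercise 9.14: Euler-Maclaurin sum formula. Use the formal expansion

$$\frac{1}{-1+e^{D}} = -(1 + e^{D} + e^{2D} + \cdots) = \frac{1}{D} - \frac{1}{2} + \frac{D}{12} - \frac{D^3}{720} + \cdots,$$

with $D$ interpreted as $d/dx$, to obtain

$$\left(-f'(x) - f'(x+1) - f'(x+2) - \cdots\right) = f(x) - \frac{1}{2}f'(x) + \frac{1}{12}f''(x) - \frac{1}{720}f^{(4)}(x) + \cdots.$$

By integrating this from $a$ to $b = a + m$, motivate the Euler-Maclaurin formula

$$\sum_{k=0}^{m-1}f(a+k) = \int_a^b f(x)\,dx + \frac{1}{2}\left(f(a)-f(b)\right) + \sum_{k=1}^{\infty}\frac{B_{2k}}{(2k)!}\left(f^{(2k-1)}(b)-f^{(2k-1)}(a)\right).$$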

This "derivation," while suggestive, is only heuristic. It gives no insight into 
whether the series converges (it usually does not) or what the error might be 
if we truncate after a finite number of terms. 
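
A quick numerical experiment shows how the truncated formula behaves. The sketch below is illustrative only: the test function $f(x) = e^{-cx}$ and the parameter values are our own choices. It assumes the Euler-Maclaurin formula in the form $\sum_{k=0}^{m-1} f(a+k) = \int_a^b f\,dx + \frac12(f(a)-f(b)) + \sum_k \frac{B_{2k}}{(2k)!}(f^{(2k-1)}(b) - f^{(2k-1)}(a))$, and uses the Bernoulli numbers $B_2 = 1/6$, $B_4 = -1/30$, $B_6 = 1/42$. For this particular $f$ with small $c$ the corrections shrink rapidly, which is a special case and not the generic behaviour.

```python
import math

# B_2, B_4, B_6 as defined in Exercise 9.13
BERNOULLI = {2: 1.0 / 6.0, 4: -1.0 / 30.0, 6: 1.0 / 42.0}

def deriv(c, x, j):
    """j-th derivative of the test function f(x) = exp(-c*x)."""
    return (-c) ** j * math.exp(-c * x)

def euler_maclaurin(c, a, m, terms):
    """Right-hand side of the Euler-Maclaurin formula for f(x) = exp(-c*x),
    keeping `terms` Bernoulli corrections (0 <= terms <= 3 here)."""
    b = a + m
    total = (deriv(c, a, 0) - deriv(c, b, 0)) / c       # integral of f from a to b
    total += 0.5 * (deriv(c, a, 0) - deriv(c, b, 0))    # (f(a) - f(b)) / 2
    for k in range(1, terms + 1):
        coeff = BERNOULLI[2 * k] / math.factorial(2 * k)
        total += coeff * (deriv(c, b, 2 * k - 1) - deriv(c, a, 2 * k - 1))
    return total

c, a, m = 0.5, 0.0, 8          # illustrative parameters
direct = sum(math.exp(-c * (a + k)) for k in range(m))
errors = []
for terms in range(4):
    approx = euler_maclaurin(c, a, m, terms)
    errors.append(abs(approx - direct))
    print(terms, approx, errors[-1])
```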
9.3.2 Infinite Product Expansions

We can play a variant of the Mittag-Leffler game with suitable entire
functions $g(z)$ and derive for them a representation as an infinite product.
Suppose that $g(z)$ has simple zeros at $z_j$. Then $(\ln g)' = g'(z)/g(z)$ is
meromorphic with poles at $z_j$, all with unit residues. Assuming that it satisfies the
uniform boundedness condition, we now use Mittag-Leffler to write

$$\frac{d}{dz}\ln g(z) = \left.\frac{g'(z)}{g(z)}\right|_{z=0} + \sum_{j=1}^{\infty}\left(\frac{1}{z-z_j}+\frac{1}{z_j}\right). \tag{9.90}$$

Integrating up we have

$$\ln g(z) = \ln g(0) + cz + \sum_{j=1}^{\infty}\left(\ln(1-z/z_j)+\frac{z}{z_j}\right),$$

where $c = g'(0)/g(0)$. We now re-exponentiate to get

$$g(z) = g(0)\,e^{cz}\prod_{j=1}^{\infty}\left(1-\frac{z}{z_j}\right)e^{z/z_j}. \tag{9.91}$$

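Example: Let $g(z) = \sin z/z$, then $g(0) = 1$, while the constant $c$, which is
the logarithmic derivative of $g$ at $z = 0$, is zero, and

$$\frac{\sin z}{z} = \prod_{n=1}^{\infty}\left(1-\frac{z}{n\pi}\right)e^{z/n\pi}\left(1+\frac{z}{n\pi}\right)e^{-z/n\pi}. \tag{9.92}$$

Thus

$$\sin z = z\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2\pi^2}\right). \tag{9.93}$$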
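
As a sanity check on the product for sin z, one can simply multiply out the first N factors. The sketch below is illustrative (the function name and the test point are our own). The truncation error falls off only like $z^2/(\pi^2 N)$, so the product converges slowly.

```python
import math

def sin_product(z, N):
    """z * prod_{n=1}^{N} (1 - z^2 / (n*pi)^2), the truncated product for sin z."""
    value = z
    for n in range(1, N + 1):
        value *= 1.0 - (z / (n * math.pi)) ** 2
    return value

z = 1.3  # illustrative test point
for N in (10, 1000, 100000):
    print(N, sin_product(z, N), math.sin(z))
```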



Convergence of Infinite Products

We have derived several infinite product formulae without discussing the issue
of their convergence. For products of terms of the form $(1 + a_n)$ with positive
$a_n$ we can reduce the question of convergence to that of $\sum_{n=1}^{\infty} a_n$.
To see why this is so, let

$$p_N = \prod_{n=1}^{N}(1 + a_n), \quad a_n > 0. \tag{9.94}$$

Then we have the inequalities

$$1 + \sum_{n=1}^{N}a_n < p_N < \exp\left\{\sum_{n=1}^{N}a_n\right\}. \tag{9.95}$$

The infinite sum and product therefore converge or diverge together. If

$$P = \prod_{n=1}^{\infty}(1 + |a_n|), \tag{9.96}$$

converges, we say that

$$p = \prod_{n=1}^{\infty}(1 + a_n), \tag{9.97}$$

converges absolutely. As with infinite sums, absolute convergence implies
convergence, but not vice-versa. Unlike infinite sums, however, an infinite
product containing negative $a_n$ can diverge to zero. If $(1 + a_n) > 0$ then
$\prod(1 + a_n)$ converges if $\sum\ln(1 + a_n)$ does, and we will say that $\prod(1 + a_n)$
diverges to zero if $\sum\ln(1 + a_n)$ diverges to $-\infty$.
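
The inequalities (9.95) can be watched at work numerically. The sketch below is illustrative only (the helper names and the two test sequences are our own choices). It tabulates $1+\sum a_n$, $p_N$ and $\exp\{\sum a_n\}$ for a convergent case, $a_n = 1/n^2$, and a divergent one, $a_n = 1/n$, for which $p_N = N + 1$.

```python
import math

def partial_product(a, N):
    """p_N = prod_{n=1}^{N} (1 + a(n))."""
    p = 1.0
    for n in range(1, N + 1):
        p *= 1.0 + a(n)
    return p

def partial_sum(a, N):
    """sum_{n=1}^{N} a(n)."""
    return sum(a(n) for n in range(1, N + 1))

cases = (("1/n^2", lambda n: 1.0 / n**2), ("1/n", lambda n: 1.0 / n))
for name, a in cases:
    for N in (10, 100, 1000):
        s, p = partial_sum(a, N), partial_product(a, N)
        print(f"a_n = {name:6s} N = {N:5d}  1+sum = {1 + s:9.4f}  "
              f"p_N = {p:10.4f}  exp(sum) = {math.exp(s):12.4f}")
```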

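Exercise 9.15: Show that

$$\prod_{n=1}^{N}\left(1+\frac{1}{n}\right) = N + 1, \qquad \prod_{n=2}^{N}\left(1-\frac{1}{n}\right) = \frac{1}{N}.$$

From these deduce that

$$\prod_{n=2}^{\infty}\left(1-\frac{1}{n^2}\right) = \frac{1}{2}.$$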

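Exercise 9.16: For $|z| < 1$, show that

$$\prod_{n=0}^{\infty}\left(1 + z^{2^n}\right) = \frac{1}{1-z}.$$

(Hint: think binary)

Exercise 9.17: For $|z| < 1$, show that

$$\prod_{n=1}^{\infty}\left(1 + z^n\right) = \prod_{n=1}^{\infty}\frac{1}{1 - z^{2n-1}}.$$

(Hint: $1 - x^{2n} = (1 - x^n)(1 + x^n)$.)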
9.4 Wiener-Hopf Equations II

The theory of Hilbert transforms has shown us some of the consequences of
functions being analytic in the upper or lower half-plane. Another applica- 
tion of these ideas is to Wiener-Hopf equations. Although we have discussed 
Wiener-Hopf integral equations in chapter ??, it is only now that we pos- 
sess the tools to appreciate the general theory. We begin, however, with 
the slightly simpler Wiener-Hopf sum equations, which are their discrete 
analogue. Here, analyticity in the upper or lower half-plane is replaced by 
analyticity within or without the unit circle. 

9.4.1 Wiener-Hopf Sum Equations 

Consider the infinite system of equations

$$y_n = \sum_{m=-\infty}^{\infty}a_{n-m}x_m, \quad -\infty < n < \infty, \tag{9.98}$$

where we are given the $y_n$ and are seeking the $x_n$.

If the $a_n$, $y_n$ are the Fourier coefficients of smooth complex-valued
functions

$$A(\theta) = \sum_{n=-\infty}^{\infty}a_n e^{in\theta}, \qquad Y(\theta) = \sum_{n=-\infty}^{\infty}y_n e^{in\theta}, \tag{9.99}$$

then the system of equations is, in principle at least, easy to solve. We
introduce the function

$$X(\theta) = \sum_{n=-\infty}^{\infty}x_n e^{in\theta}, \tag{9.100}$$

and (9.98) becomes

$$Y(\theta) = A(\theta)X(\theta). \tag{9.101}$$

From this, the desired $x_n$ may be read off as the Fourier expansion coefficients
of $Y(\theta)/A(\theta)$. We see that $A(\theta)$ must be nowhere zero or else the operator $A$
represented by the infinite matrix $a_{n-m}$ will not be invertible. This technique
is a discrete version of the Fourier transform method for solving the integral
equation

$$y(s) = \int_{-\infty}^{\infty}A(s-t)x(t)\,dt, \quad -\infty < s < \infty. \tag{9.102}$$

The connection with complex analysis is made by regarding $A(\theta)$, $X(\theta)$, $Y(\theta)$
as being functions on the unit circle in the $z$ plane. If they are smooth enough
we can extend their definition to an annulus about the unit circle, so that

$$A(z) = \sum_{n=-\infty}^{\infty}a_n z^n, \qquad X(z) = \sum_{n=-\infty}^{\infty}x_n z^n, \qquad Y(z) = \sum_{n=-\infty}^{\infty}y_n z^n. \tag{9.103}$$

The $x_n$ may now be read off as the Laurent expansion coefficients of $Y(z)/A(z)$.
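
The recipe "divide the symbols, then read off coefficients" is easy to try out. The following Python sketch is a minimal illustration under stated assumptions: the coefficient dictionaries, the function names, and the choice of M = 64 sample points on the unit circle are all ours. It samples $Y/A$ on the circle and extracts the $x_n$ with a discrete Fourier sum. For $A(z) = 1 - z/2$ and $y_n = \delta_{n0}$ the exact answer is $x_n = 2^{-n}$ for $n \ge 0$ and zero otherwise.

```python
import cmath
import math

def symbol(coeffs, theta):
    """Evaluate sum_n c_n exp(i n theta) for a dict {n: c_n}."""
    return sum(c * cmath.exp(1j * n * theta) for n, c in coeffs.items())

def solve_full_line(a, y, M=64):
    """Approximate x_n for -M/2 <= n < M/2 from X = Y/A sampled at M points."""
    thetas = [2.0 * math.pi * j / M for j in range(M)]
    X = [symbol(y, th) / symbol(a, th) for th in thetas]   # X = Y/A, as in (9.101)
    x = {}
    for n in range(-M // 2, M // 2):
        # discrete approximation to the n-th Fourier coefficient of X
        x[n] = sum(Xj * cmath.exp(-1j * n * th) for Xj, th in zip(X, thetas)) / M
    return x

a = {0: 1.0, 1: -0.5}   # a_0 = 1, a_1 = -1/2, so y_n = x_n - x_{n-1}/2
y = {0: 1.0}            # y_n = delta_{n,0}
x = solve_full_line(a, y)
print([round(x[n].real, 6) for n in range(-2, 5)])
```

The finite sampling introduces aliasing errors of order $2^{-M}$ here, which is why a modest M suffices. A symbol with a zero on the unit circle would make the division fail, in line with the invertibility remark above.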
The discrete analogue of the Wiener-Hopf integral equation

$$y(s) = \int_0^{\infty}A(s-t)x(t)\,dt, \quad 0 < s < \infty, \tag{9.104}$$

is the Wiener-Hopf sum equation

$$y_n = \sum_{m=0}^{\infty}a_{n-m}x_m, \quad 0 \le n < \infty. \tag{9.105}$$

This requires a more sophisticated approach. If you look back at our earlier
discussion of Wiener-Hopf integral equations in chapter ??, you will see that
the trick for solving them is to extend the definition of $y(s)$ to negative $s$
(analogously, the $y_n$ to negative $n$) and find these values at the same time as we
find $x(s)$ for positive $s$ (analogously, the $x_n$ for positive $n$).

We proceed by introducing the same functions $A(z)$, $X(z)$, $Y(z)$ as before,
but now keep careful track of whether their power-series expansions contain
positive or negative powers of $z$. In doing so, we will discover that the
Fredholm alternative governing the existence and uniqueness of the solutions
will depend on the winding number $N = n(\Gamma, 0)$, where $\Gamma$ is the image of the
unit circle under the map $z \mapsto A(z)$; in other words, on how many times
$A(z)$ wraps around the origin as $z$ goes once round the unit circle.

Suppose that $A(z)$ is smooth enough that it is analytic in an annulus
including the unit circle, and that we can factorize $A(z)$ so that

$$A(z) = \lambda\,q_+(z)\,z^N\,[q_-(z)]^{-1}, \tag{9.106}$$

where

$$q_+(z) = 1 + \sum_{n=1}^{\infty}q^+_n z^n, \qquad q_-(z) = 1 + \sum_{n=1}^{\infty}q^-_{-n}z^{-n}. \tag{9.107}$$

Here we demand that $q_+(z)$ be analytic and non-zero for $|z| < 1 + \epsilon$, and
that $q_-(z)$ be analytic and non-zero for $|1/z| < 1 + \epsilon$. These no-pole, no-zero
conditions ensure, via the principle of the argument, that the winding
numbers of $q_\pm(z)$ about the origin are zero, and so all the winding of $A(z)$ is
accounted for by the $N$-fold winding of the $z^N$ factor. The non-zero condition
also ensures that the reciprocals $[q_\pm(z)]^{-1}$ have the same class of expansions (i.e.
in positive or negative powers of $z$ only) as the direct functions.

We now introduce the notation $[F(z)]_+$ and $[F(z)]_-$, meaning that we
expand $F(z)$ as a Laurent series and retain only the positive powers of $z$
(including $z^0$), or only the negative powers (starting from $z^{-1}$), respectively.
Thus $F(z) = [F(z)]_+ + [F(z)]_-$. We will write $Y_\pm(z) = [Y(z)]_\pm$, and similarly
for $X(z)$. We can therefore rewrite (9.105) in the form

$$\lambda z^N q_+(z)X_+(z) = [Y_+(z) + Y_-(z)]\,q_-(z). \tag{9.108}$$

If $N \ge 0$, and we break this equation into its positive and negative powers,
we find

$$[Y_+ q_-]_+ = \lambda z^N q_+(z)X_+(z), \qquad [Y_+ q_-]_- = -Y_- q_-(z). \tag{9.109}$$

From the first of these equations we can read off the desired $x_n$ as the
positive-power Laurent coefficients of

$$X_+(z) = [Y_+ q_-]_+\,(\lambda z^N q_+(z))^{-1}. \tag{9.110}$$

As a byproduct, the second allows us to find the coefficients of $Y_-(z)$.
Observe that there is a condition on $Y_+$ for this to work: the power series
expansion of $\lambda z^N q_+(z)X_+$ starts with $z^N$, and so for a solution to exist the
first $N$ terms of $[Y_+ q_-]_+$ as a power series in $z$ must be zero. The given
vector $y_n$ must therefore satisfy $N$ consistency conditions. A formal way of
expressing this constraint begins by observing that it means that the range of
the operator $A$ represented by the matrix $a_{n-m}$ falls short, by $N$ dimensions,
of being the entire space of possible $y_n$. This is exactly the situation that
the notion of a "cokernel" is intended to capture. Recall that if $A : V \to V$,
then $\operatorname{Coker} A = V/\operatorname{Im} A$. We therefore have

$$\dim[\operatorname{Coker} A] = N.$$

When $N < 0$, on the other hand, we have

$$[Y_+(z)q_-(z)]_+ = [\lambda z^{-|N|}q_+(z)X_+(z)]_+,$$
$$[Y_+(z)q_-(z)]_- = -Y_-(z)q_-(z) + [\lambda z^{-|N|}q_+(z)X_+(z)]_-. \tag{9.111}$$

Here the last term in the second equation contains no more than $|N|$ terms.
Because of the $z^{-|N|}$, we can add to $X_+$ any multiple of $Z_+(z) = z^n[q_+(z)]^{-1}$
for $n = 0, \ldots, |N| - 1$, and still have a solution. Thus the solution is not unique.
Instead, we have $\dim[\operatorname{Ker}(A)] = |N|$.
We have therefore shown that

$$\operatorname{Index}(A) = \dim(\operatorname{Ker} A) - \dim(\operatorname{Coker} A) = -N.$$

This connection between a topological quantity (in the present case the
winding number) and the difference in dimension of the kernel and cokernel
is an example of an index theorem.
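
The winding number that controls the index is simple to estimate numerically by accumulating small phase increments of $A(z)$ around the unit circle. The sketch below is illustrative: the function names, the sample count, and the three test symbols are our own choices. The first two examples are the pure shifts. For $A(z) = z$ the equation reads $y_n = x_{n-1}$, which forces $y_0 = 0$ (one consistency condition, $N = 1$). For $A(z) = 1/z$ it reads $y_n = x_{n+1}$, which leaves $x_0$ free (a one-dimensional kernel, $N = -1$).

```python
import cmath
import math

def winding_number(A, samples=4096):
    """Number of times A(z) encircles the origin as z goes once round |z| = 1.
    Assumes A has no zeros or poles on the unit circle."""
    total = 0.0
    previous = A(1.0 + 0.0j)
    for j in range(1, samples + 1):
        current = A(cmath.exp(2j * math.pi * j / samples))
        total += cmath.phase(current / previous)   # small phase increment
        previous = current
    return round(total / (2.0 * math.pi))

def index(A):
    """dim Ker A - dim Coker A = -N for the half-line sum equation."""
    return -winding_number(A)

examples = {
    "A(z) = z": lambda z: z,
    "A(z) = 1/z": lambda z: 1.0 / z,
    "A(z) = z^2 (1 - 0.3 z)/(1 - 0.5/z)": lambda z: z**2 * (1 - 0.3 * z) / (1 - 0.5 / z),
}
for name, A in examples.items():
    print(name, " N =", winding_number(A), " index =", index(A))
```

In the third example the factors $(1 - 0.3z)$ and $(1 - 0.5/z)$ play the role of $q_+$ and $q_-$: they do not wind, so all of the winding comes from $z^2$.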

We now need to show that we can indeed factorize $A(z)$ in the desired
manner. When $A(z)$ is a rational function, the factorization is straightforward: if

$$A(z) = C\,\frac{\prod_n (z - a_n)}{\prod_m (z - b_m)}, \tag{9.112}$$

we simply take

$$q_+(z) = \frac{\prod_{|a_n|>1}(1 - z/a_n)}{\prod_{|b_m|>1}(1 - z/b_m)}, \tag{9.113}$$

where the products are over the linear factors corresponding to poles and
zeros outside the unit circle, and

$$q_-(z) = \frac{\prod_{|b_m|<1}(1 - b_m/z)}{\prod_{|a_n|<1}(1 - a_n/z)}, \tag{9.114}$$

containing the linear factors corresponding to poles and zeros inside the unit
circle. The constant $\lambda$ and the power $z^N$ in equation (9.106) are the factors
that we have extracted from the right-hand sides of (9.113) and (9.114),
respectively, in order to leave 1's as the first term in each linear factor.
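
For a concrete rational symbol the bookkeeping can be checked directly. The following sketch is illustrative: the function names and the sample zeros and poles are our own. It extracts $\lambda$, $N$, $q_+$ and $q_-$ from lists of zeros $a_n$ and poles $b_m$, and confirms at a point on the unit circle that $A(z) = \lambda\,q_+(z)\,z^N/q_-(z)$. It assumes that no zero or pole lies on the unit circle.

```python
import cmath

def factorize_rational(C, zeros, poles):
    """Return (lam, N, q_plus, q_minus) with A(z) = lam * q_plus(z) * z**N / q_minus(z)."""
    outside_zeros = [a for a in zeros if abs(a) > 1]
    inside_zeros = [a for a in zeros if abs(a) < 1]
    outside_poles = [b for b in poles if abs(b) > 1]
    inside_poles = [b for b in poles if abs(b) < 1]

    lam = C
    for a in outside_zeros:
        lam *= -a                 # (z - a) = (-a)(1 - z/a)
    for b in outside_poles:
        lam /= -b
    # each inside factor (z - a) = z (1 - a/z) contributes one power of z
    N = len(inside_zeros) - len(inside_poles)

    def q_plus(z):
        value = 1.0
        for a in outside_zeros:
            value *= 1 - z / a
        for b in outside_poles:
            value /= 1 - z / b
        return value

    def q_minus(z):
        value = 1.0
        for b in inside_poles:
            value *= 1 - b / z
        for a in inside_zeros:
            value /= 1 - a / z
        return value

    return lam, N, q_plus, q_minus

def rational_symbol(z, C, zeros, poles):
    """A(z) = C prod (z - a_n) / prod (z - b_m)."""
    value = C
    for a in zeros:
        value *= z - a
    for b in poles:
        value /= z - b
    return value

C, zeros, poles = 2.0, [0.5, 0.2j, 3.0], [0.25]   # illustrative data
lam, N, q_plus, q_minus = factorize_rational(C, zeros, poles)
z = cmath.exp(0.9j)
mismatch = abs(rational_symbol(z, C, zeros, poles) - lam * q_plus(z) * z**N / q_minus(z))
print("N =", N, " lambda =", lam, " mismatch =", mismatch)
```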
More generally, we take the logarithm of

$$z^{-N}A(z) = \lambda\,q_+(z)(q_-(z))^{-1} \tag{9.115}$$

to get

$$\ln[z^{-N}A(z)] = \ln[\lambda q_+(z)] - \ln[q_-(z)], \tag{9.116}$$

where we desire $\ln[\lambda q_+(z)]$ to be the boundary value of a function analytic
within the unit circle, and $\ln[q_-(z)]$ the boundary value of a function analytic
outside the unit circle and with $q_-(z)$ tending to unity as $|z| \to \infty$. The
factor of $z^{-N}$ in the logarithm serves to undo the winding of the argument
of $A(z)$, and results in a single-valued logarithm on the unit circle. Plemelj
now shows that

$$Q(z) = \frac{1}{2\pi i}\oint_{|\zeta|=1}\frac{\ln[\zeta^{-N}A(\zeta)]}{\zeta - z}\,d\zeta \tag{9.117}$$

provides us with the desired factorization. This function $Q(z)$ is everywhere
analytic except for a branch cut along the unit circle, and its branches, $Q_+$
within and $Q_-$ without the circle, differ by $\ln[z^{-N}A(z)]$. We therefore have

$$\lambda q_+(z) = e^{Q_+(z)}, \qquad q_-(z) = e^{Q_-(z)}. \tag{9.118}$$

The expression for $Q$ as an integral shows that $Q(z) \sim \text{const.}/z$ as $|z|$
goes to infinity and so guarantees that $q_-(z)$ has the desired limit of unity
there.

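The task of finding this factorization is known as the scalar Riemann-Hilbert
problem. In effect, we are decomposing the infinite matrix

$$\mathbf{A} = \begin{pmatrix} \ddots & \vdots & \vdots & \vdots & \\ \cdots & a_0 & a_1 & a_2 & \cdots \\ \cdots & a_{-1} & a_0 & a_1 & \cdots \\ \cdots & a_{-2} & a_{-1} & a_0 & \cdots \\ & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{9.119}$$

into the product of an upper triangular matrix

$$\mathbf{U} = \lambda\begin{pmatrix} \ddots & \vdots & \vdots & \vdots & \\ \cdots & 1 & q^+_1 & q^+_2 & \cdots \\ \cdots & 0 & 1 & q^+_1 & \cdots \\ \cdots & 0 & 0 & 1 & \cdots \\ & \vdots & \vdots & \vdots & \ddots \end{pmatrix}, \tag{9.120}$$

a lower triangular matrix $\mathbf{L}$, where

$$\mathbf{L}^{-1} = \begin{pmatrix} \ddots & \vdots & \vdots & \vdots & \\ \cdots & 1 & 0 & 0 & \cdots \\ \cdots & q^-_{-1} & 1 & 0 & \cdots \\ \cdots & q^-_{-2} & q^-_{-1} & 1 & \cdots \\ & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \tag{9.121}$$

has 1's on the diagonal, and a matrix $\boldsymbol{\Lambda}_N$ which is zero everywhere
except for a line of 1's located $N$ steps above the main diagonal. The set
of triangular matrices with unit diagonal form a group, so the inversion
required to obtain $\mathbf{L}$ results in a matrix of the same form. The resulting
Birkhoff factorization

$$\mathbf{A} = \mathbf{L}\boldsymbol{\Lambda}_N\mathbf{U}, \tag{9.122}$$

is an infinite-dimensional extension of the Gauss-Bruhat (or generalized LU)
decomposition of a matrix. The finite-dimensional Gauss-Bruhat decomposition
provides a factorization of a matrix $\mathbf{A} \in GL(n)$ as

$$\mathbf{A} = \mathbf{L}\boldsymbol{\Pi}\mathbf{U}, \tag{9.123}$$

where $\mathbf{L}$ is a lower triangular matrix with 1's on the diagonal, $\mathbf{U}$ is an upper
triangular matrix with no zeros on the diagonal, and $\boldsymbol{\Pi}$ is a permutation
matrix, i.e. a matrix that permutes the basis vectors by having one entry of
1 in each row and in each column, and all other entries zero. Our present $\boldsymbol{\Lambda}_N$
is playing the role of such a matrix. The matrix $\boldsymbol{\Pi}$ is uniquely determined
by $\mathbf{A}$. The $\mathbf{L}$ and $\mathbf{U}$ matrices become unique if $\mathbf{L}$ is chosen so that $\boldsymbol{\Pi}^T\mathbf{L}\boldsymbol{\Pi}$
is also lower triangular.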
9.4.2 Wiener-Hopf Integral Equations

We now carry over our insights from the simpler sum equations to Wiener-Hopf
integral equations

$$\int_0^{\infty}K(x-y)\phi(y)\,dy = f(x), \quad x > 0, \tag{9.124}$$

by imagining replacing the unit circle by a circle of radius $R$, and then taking
$R \to \infty$ in such a way that the sums go over to integrals. In this way many
features are retained: the problem is still solved by factorizing the Fourier
transform

$$\tilde K(k) = \int_{-\infty}^{\infty}K(x)e^{ikx}\,dx \tag{9.125}$$

of the kernel, and there remains an index theorem

$$\dim(\operatorname{Ker} K) - \dim(\operatorname{Coker} K) = -N, \tag{9.126}$$

but $N$ now counts the winding of the phase of $\tilde K(k)$ as $k$ ranges over the real
axis:

$$N = \frac{1}{2\pi}\arg\tilde K\,\Big|_{k=-\infty}^{k=+\infty}. \tag{9.127}$$

One restriction arises though: we will require $K$ to be of the form

$$K(x-y) = \delta(x-y) + g(x-y) \tag{9.128}$$

for some continuous function $g(x)$. Our discussion is therefore being
restricted to Wiener-Hopf integral equations of the second kind.

The restriction comes about because we will seek to obtain a
factorization of $\tilde K$ as

$$\tau(k)\tilde K(k) = \exp\{Q_+(k) - Q_-(k)\} = q_+(k)(q_-(k))^{-1}, \tag{9.129}$$

where $q_+(k) \equiv \exp\{Q_+(k)\}$ is analytic and non-zero in the upper half $k$-plane
and $q_-(k) \equiv \exp\{Q_-(k)\}$ analytic and non-zero in the lower half-plane. The
factor $\tau(k)$ is a phase such as

$$\tau(k) = \left(\frac{k+i}{k-i}\right)^N, \tag{9.130}$$

which winds $-N$ times and serves to undo the $+N$ phase winding in
$\tilde K$. The $Q_\pm(k)$ will be the boundary values from above and below the real
axis, respectively, of

$$Q(k) = \frac{1}{2\pi i}\int_{-\infty}^{\infty}\frac{\ln[\tau(\kappa)\tilde K(\kappa)]}{\kappa - k}\,d\kappa. \tag{9.131}$$

The convergence of this infinite integral requires that $\ln[\tau(k)\tilde K(k)]$ go to zero
at infinity, or, in other words,

$$\lim_{k\to\infty}\tilde K(k) = 1. \tag{9.132}$$

This, in turn, requires that the original $K(x)$ contain a delta function.
Example: We will solve the problem

$$\phi(x) - \lambda\int_0^{\infty}e^{-|x-y|-\alpha(x-y)}\phi(y)\,dy = f(x), \quad x > 0. \tag{9.133}$$

We require that $0 < \alpha < 1$. The upper bound on $\alpha$ is necessary for the
integral kernel to be bounded. We will also assume for simplicity that $\lambda < 1/2$.
Following the same strategy as in the sum case, we extend the integral
equation to the entire range of $x$ by writing

$$\phi(x) - \lambda\int_0^{\infty}e^{-|x-y|-\alpha(x-y)}\phi(y)\,dy = f(x) + g(x), \tag{9.134}$$

where $f(x)$ is non-zero only for $x > 0$ and $g(x)$ is non-zero only for $x < 0$.
The Fourier transform of this equation is

$$\left(\frac{(k+i\alpha)^2 + a^2}{(k+i\alpha)^2 + 1}\right)\tilde\phi_+(k) = \tilde f_+(k) + \tilde g_-(k), \tag{9.135}$$

where $a^2 = 1 - 2\lambda$ and the $\pm$ subscripts are to remind us that $\tilde\phi(k)$ and $\tilde f(k)$
are analytic in the upper half-plane, and $\tilde g(k)$ in the lower. We will use the
notation $H_+$ for the space of functions analytic in the upper half plane, and
$H_-$ for functions analytic in the lower half plane, and so

$$\tilde\phi_+(k),\ \tilde f_+(k) \in H_+, \qquad \tilde g_-(k) \in H_-. \tag{9.136}$$

We can factorize

$$\tilde K(k) = \frac{(k+i\alpha)^2 + a^2}{(k+i\alpha)^2 + 1} = \frac{[k+i(\alpha-a)]\,[k+i(\alpha+a)]}{[k+i(\alpha-1)]\,[k+i(\alpha+1)]}. \tag{9.137}$$

Now suppose that $a$ is small enough that $\alpha \pm a > 0$ and so the numerator
has two zeros in the lower half plane, and the denominator has one zero in each
of the upper and lower half-planes. The change of phase in $\tilde K(k)$ as we go
from minus to plus infinity is therefore $-2\pi$, and so the index is $N = -1$.
We should therefore multiply $\tilde K$ by

$$\tau(k) = \left(\frac{k+i}{k-i}\right)^{-1} \tag{9.138}$$

before seeking to break it into its $q_\pm$ factors. We can however equally well
take

$$\tau(k) = \left(\frac{k+i(\alpha-1)}{k+i(\alpha-a)}\right), \tag{9.139}$$

as this also undoes the winding and allows us to factorize with

$$q_+(k) = \left(\frac{k+i(\alpha+a)}{k+i(\alpha+1)}\right), \qquad q_-(k) = 1. \tag{9.140}$$

The resultant equation analogous to (9.108) is therefore

$$\left(\frac{k+i(\alpha+a)}{k+i(\alpha+1)}\right)\tilde\phi_+ = \left(\frac{k+i(\alpha-1)}{k+i(\alpha-a)}\right)\tilde f_+ + \left(\frac{k+i(\alpha-1)}{k+i(\alpha-a)}\right)\tilde g_-,$$
$$q_+\tilde\phi_+ = (\tau q_-)\tilde f_+ + \tau q_-\tilde g_-. \tag{9.141}$$

The second line of this equation shows the interpretation of the first line in
terms of the objects in the general theory. The left hand side is in $H_+$,
i.e. analytic in the upper half-plane. The first term on the right is also in
$H_+$. (We are lucky. More generally it would have to be decomposed into its
$H_\pm$ parts.) If it were not for the $\tau(k)$, the last term would be in $H_-$, but
it has a potential pole at $k = -i(\alpha - a)$. We therefore remove this pole by
subtracting a term

$$\frac{\beta}{k+i(\alpha-a)}$$

(an element of $H_+$) from each side of the equation before projecting onto the
$H_\pm$ parts. After projecting, we find that

$$\left(\frac{k+i(\alpha+a)}{k+i(\alpha+1)}\right)\tilde\phi_+ - \left(\frac{k+i(\alpha-1)}{k+i(\alpha-a)}\right)\tilde f_+ - \frac{\beta}{k+i(\alpha-a)} = 0,$$
$$\left(\frac{k+i(\alpha-1)}{k+i(\alpha-a)}\right)\tilde g_- - \frac{\beta}{k+i(\alpha-a)} = 0. \tag{9.142}$$

We solve for $\tilde\phi_+(k)$ and $\tilde g_-(k)$:

$$\tilde\phi_+ = \left(\frac{(k+i\alpha)^2 + 1}{(k+i\alpha)^2 + a^2}\right)\tilde f_+ + \beta\left(\frac{k+i(\alpha+1)}{(k+i\alpha)^2 + a^2}\right),$$
$$\tilde g_- = \frac{\beta}{k+i(\alpha-1)}. \tag{9.143}$$

Observe that $\tilde g_-(k)$ is always in $H_-$ because its only singularity is in the upper
half-plane for any $\beta$. The constant $\beta$ is therefore arbitrary. Finally, we invert
the Fourier transform, using

$$\mathcal{F}\left(\theta(x)e^{-\alpha x}\sinh ax\right) = \frac{-a}{(k+i\alpha)^2 + a^2}, \quad (\alpha \pm a) > 0, \tag{9.144}$$

to find that

$$\phi(x) = f(x) - \frac{2\lambda}{a}\int_0^x e^{-\alpha(x-y)}\sinh a(x-y)\,f(y)\,dy + \beta'\left((a-1)e^{-(\alpha+a)x} + (a+1)e^{-(\alpha-a)x}\right), \tag{9.145}$$

where $\beta'$ (proportional to $\beta$) is an arbitrary constant.

By taking $\alpha$ in the range $-1 < \alpha < 0$ with $(\alpha \pm a) < 0$, we make the index
$N = +1$. We will then find there is a condition on $f(x)$ for the solution
to exist. This condition is, of course, that $f(x)$ be orthogonal to the solution

$$\phi_0(x) = (a-1)e^{(\alpha-a)x} + (a+1)e^{(\alpha+a)x} \tag{9.146}$$

of the homogeneous adjoint problem, this being the $f(x) = 0$ case of the $\alpha > 0$
problem that we have just solved.
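
The non-uniqueness found for $\alpha \pm a > 0$ can be confirmed by brute force: the function multiplying $\beta'$ in (9.145) should be annihilated by the integral operator. The sketch below is illustrative only. The Simpson-rule helper, the cutoff of the infinite range, and the parameter values $\lambda = 0.375$, $\alpha = 0.7$ (so that $a = 0.5$) are our own choices. It evaluates $\phi(x) - \lambda\int_0^\infty e^{-|x-y|-\alpha(x-y)}\phi(y)\,dy$ for $\phi(x) = (a-1)e^{-(\alpha+a)x} + (a+1)e^{-(\alpha-a)x}$, where $\lambda$ is the coupling constant of (9.133).

```python
import math

def simpson(g, lo, hi, n=4000):
    """Composite Simpson rule with n (even) subintervals."""
    h = (hi - lo) / n
    total = g(lo) + g(hi)
    for j in range(1, n):
        total += (4 if j % 2 else 2) * g(lo + j * h)
    return total * h / 3.0

def residual(x, lam, alpha, cutoff=80.0):
    """phi(x) - lam * int_0^inf exp(-|x-y| - alpha*(x-y)) phi(y) dy
    for the homogeneous solution appearing in (9.145)."""
    a = math.sqrt(1.0 - 2.0 * lam)
    phi = lambda t: ((a - 1) * math.exp(-(alpha + a) * t)
                     + (a + 1) * math.exp(-(alpha - a) * t))
    integrand = lambda y: math.exp(-abs(x - y) - alpha * (x - y)) * phi(y)
    # split at y = x, where |x - y| has a kink, so each piece is smooth;
    # the upper range is truncated at x + cutoff, where the integrand is negligible
    integral = simpson(integrand, 0.0, x) + simpson(integrand, x, x + cutoff)
    return phi(x) - lam * integral

lam, alpha = 0.375, 0.7       # illustrative values: a = 0.5 and alpha +- a > 0
for x in (0.5, 2.0, 5.0):
    print(x, residual(x, lam, alpha))
```

The residuals should vanish to within the quadrature error, in agreement with $\dim(\operatorname{Ker} K) = 1$ when $N = -1$.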



9.5 Further Exercises and Problems 

Exercise 9.18: Contour Integration: Use the calculus of residues to evaluate
the following integrals:

$$I_1 = \int_0^{2\pi}\frac{d\theta}{(a + b\cos\theta)^2}, \quad 0 < b < a.$$

$$I_2 = \int_0^{2\pi}\frac{\cos^2 3\theta}{1 - 2a\cos 2\theta + a^2}\,d\theta, \quad 0 < a < 1.$$

$$I_3 = \int_0^{\infty}\frac{x^a}{(1 + x^2)^2}\,dx, \quad -1 < a < 2.$$

These are not meant to be easy! You will have to dig for the residues.
Answers:

$$I_1 = \frac{2\pi a}{(a^2 - b^2)^{3/2}},$$
$$I_2 = \frac{\pi(a^3 + 1)}{1 - a^2} = \frac{\pi(1 - a + a^2)}{1 - a},$$
$$I_3 = \frac{\pi(1 - a)}{4\cos(\pi a/2)}.$$

Exercise 9.19: By considering the integral of

$$f(z) = \ln(1 - e^{2iz}) = \ln(-2ie^{iz}\sin z)$$

around the indented rectangle

[Figure 9.12: Indented rectangle, with corners labelled $iY$, $\pi + iY$ and $\pi$.]

with vertices $0$, $\pi$, $\pi + iY$, $iY$, and letting $Y$ become large, evaluate the integral

$$I = \int_0^{\pi}\ln(\sin x)\,dx.$$

Explain how the fact that $\epsilon\ln\epsilon \to 0$ as $\epsilon \to 0$ allows us to ignore contributions
from the small indentations. You should also provide justification for any other
discarded contributions. Take care to make consistent choices of the branch of
the logarithm, especially if expanding $\ln(-2ie^{ix}\sin x) = ix + \ln 2 + \ln(\sin x) + \ln(-i)$.
The value of $I$ is a real number.

Exercise 9.20: By integrating a suitable function around the quadrant
containing the point $z_0 = e^{i\pi/4}$, evaluate the integral

$$I(\alpha) = \int_0^{\infty}\frac{x^{\alpha-1}}{1 + x^4}\,dx, \quad 0 < \alpha < 4.$$

(It should only be necessary to consider the residue at $z_0$.)

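Exercise 9.21: In section ?? we considered the causal Green function for the
damped harmonic oscillator

$$G(t) = \begin{cases}\dfrac{1}{\Omega}e^{-\gamma t}\sin(\Omega t), & t > 0,\\[4pt] 0, & t < 0,\end{cases}$$

and showed that its Fourier transform

$$\int_{-\infty}^{\infty}e^{i\omega t}G(t)\,dt = \frac{1}{\Omega^2 - (\omega + i\gamma)^2} \tag{9.147}$$

had no singularities in the upper half-plane. Use Jordan's lemma to compute
the inverse Fourier transform

$$\frac{1}{2\pi}\int_{-\infty}^{\infty}\frac{e^{-i\omega t}}{\Omega^2 - (\omega + i\gamma)^2}\,d\omega,$$

and verify that it reproduces $G(t)$.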

Problem 9.22: Jordan's Lemma and one-dimensional scattering theory. In
problem ??.?? we considered the one-dimensional scattering problem solutions

$$\psi^{(+)}_k(x) = \begin{cases} e^{ikx} + r_L(k)e^{-ikx}, & x \in L,\\ t_L(k)e^{ikx}, & x \in R,\end{cases}$$

$$\psi^{(-)}_k(x) = \begin{cases} t_R(k)e^{ikx}, & x \in L,\\ e^{ikx} + r_R(k)e^{-ikx}, & x \in R,\end{cases}$$

and claimed that the bound-state contributions to the completeness relation
were given in terms of the reflection and transmission coefficients as

$$\sum_{\rm bound}\psi^*_n(x)\psi_n(x') = \begin{cases} -\displaystyle\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,r_L(k)e^{-ik(x+x')}, & x, x' \in L,\\[8pt] -\displaystyle\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,t_L(k)e^{-ik(x-x')}, & x \in L,\ x' \in R,\\[8pt] -\displaystyle\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,t_R(k)e^{-ik(x-x')}, & x \in R,\ x' \in L,\\[8pt] -\displaystyle\int_{-\infty}^{\infty}\frac{dk}{2\pi}\,r_R(k)e^{-ik(x+x')}, & x, x' \in R.\end{cases}$$

The eigenfunctions $\psi^{(+)}_k(x)$ and $\psi^{(-)}_k(x)$
are initially defined for $k$ real and positive ($\psi^{(+)}_k$) or for $k$ real and negative
($\psi^{(-)}_k$), but they separately have analytic continuations to all of $k \in \mathbb{C}$. The
reflection and transmission coefficients $r_{L,R}(k)$ and $t_{L,R}(k)$ are also analytic
functions of $k$, and obey $r_{L,R}(k) = r^*_{L,R}(-k^*)$, $t_{L,R}(k) = t^*_{L,R}(-k^*)$.

a) By inspecting the formulas for $\psi^{(\pm)}_k(x)$, show that the bound states $\psi_n(x)$,
with $E_n = -\kappa_n^2$, are proportional to $\psi^{(+)}_k(x)$ evaluated at points $k = i\kappa_n$
on the positive imaginary axis at which $r_L(k)$ and $t_L(k)$ simultaneously
have poles. Similarly show that these same bound states are proportional
to $\psi^{(-)}_k(x)$ evaluated at points $-i\kappa_n$ on the negative imaginary axis at
which $r_R(k)$ and $t_R(k)$ have poles. (All these functions $\psi^{(\pm)}_k(x)$, $r_{R,L}(k)$,
$t_{R,L}(k)$, may have branch points and other singularities in the half-plane
on the opposite side of the real axis from the bound-state poles.)
b) Use Jordan's lemma to evaluate the Fourier transforms given above in
terms of the position and residues of the bound-state poles. Confirm
that your answers are of the form

$$\sum_n A^*_n[\operatorname{sgn}(x)]\,e^{-\kappa_n|x|}\,A_n[\operatorname{sgn}(x')]\,e^{-\kappa_n|x'|},$$

as you would expect for the bound-state contribution to the completeness
relation.

Exercise 9.23: Lattice Matsubara sums: Show that sums over the $N$-th roots
of $-1$ can be written as an integral

$$\frac{1}{N}\sum_{\omega^N=-1}f(\omega) = \oint_C\frac{dz}{2\pi i}\,\frac{z^{N-1}}{z^N + 1}\,f(z),$$

where $C$ consists of a pair of oppositely oriented concentric circles. The
annulus formed by the circles should include all the roots of $-1$, but exclude all
singularities of $f$. Use this trick to show that, for $N$ even,

$$\frac{1}{N}\sum_{\omega^N=-1}\frac{\sinh E}{\sinh^2 E + \sin^2 k} = \frac{1}{\cosh E}\tanh\frac{NE}{2},$$

where $\omega = e^{ik}$. Take the $N \to \infty$ limit in some suitable manner, and hence show that

$$\sum_{n=-\infty}^{\infty}\frac{a}{a^2 + [(2n+1)\pi]^2} = \frac{1}{2}\tanh\frac{a}{2}.$$

(Hint: If you are careless, you will end up differing by a factor of two from this
last formula. There are two regions in the finite sum that tend to the infinite
sum in the large $N$ limit.)

Problem 9.24: If we define $\chi(x) = e^{\alpha x}\phi(x)$, and $F(x) = e^{\alpha x}f(x)$, then the
Wiener-Hopf equation

$$\phi(x) - \lambda\int_0^{\infty}e^{-|x-y|-\alpha(x-y)}\phi(y)\,dy = f(x), \quad x > 0,$$

becomes

$$\chi(x) - \lambda\int_0^{\infty}e^{-|x-y|}\chi(y)\,dy = F(x), \quad x > 0,$$

all mention of $\alpha$ having disappeared! Why then does our answer, worked out
in such detail in section 9.4.2, depend on the parameter $\alpha$? Show that if $\alpha$ is
small enough that $\alpha + a$ is positive and $\alpha - a$ is negative, then $\phi(x)$ really is
independent of $\alpha$. (Hint: What tacit assumptions about function spaces does
our use of Fourier transforms entail? How does the inverse Fourier transform
of $[(k + i\alpha)^2 + a^2]^{-1}$ vary with $\alpha$?)



Chapter 10 
Special Functions II 



In this chapter we will apply complex analytic methods so as to obtain a
wider view of some of the special functions of mathematical physics than can
be obtained on the real axis. The standard text in this field remains the
venerable Course of Modern Analysis of E. T. Whittaker and G. N. Watson.

10.1 The Gamma Function

We begin with Euler's "Gamma Function" $\Gamma(z)$. You probably have some
acquaintance with this creature. The usual definition is

$$\Gamma(z) = \int_0^{\infty}t^{z-1}e^{-t}\,dt, \quad \operatorname{Re} z > 0, \quad \text{(definition A)}. \tag{10.1}$$

An integration by parts, based on

$$\frac{d}{dt}\left(t^z e^{-t}\right) = z\,t^{z-1}e^{-t} - t^z e^{-t}, \tag{10.2}$$

shows that

$$\left[t^z e^{-t}\right]_0^{\infty} = z\int_0^{\infty}t^{z-1}e^{-t}\,dt - \int_0^{\infty}t^z e^{-t}\,dt. \tag{10.3}$$

The integrated out part vanishes at both limits, provided the real part of $z$
is greater than zero. Thus

$$\Gamma(z+1) = z\,\Gamma(z). \tag{10.4}$$

Since $\Gamma(1) = 1$, we deduce that

$$\Gamma(n) = (n-1)!, \quad n = 1, 2, 3, \ldots. \tag{10.5}$$

We can use the recurrence relation to extend the definition of $\Gamma(z)$ to the left
half plane, where the real part of $z$ is negative. Choosing an integer $n$ such
that the real part of $z + n$ is positive, we write

$$\Gamma(z) = \frac{\Gamma(z+n)}{z(z+1)\cdots(z+n-1)}. \tag{10.6}$$

We see that $\Gamma(z)$ has poles at zero, and at the negative integers. The residue
of the pole at $z = -n$ is $(-1)^n/n!$.
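
The continuation by recurrence in (10.6) is easy to try numerically. The sketch below is illustrative: the function name is ours, and it handles only real, non-integer arguments. It evaluates the right-hand side of (10.6) using `math.gamma` only at positive arguments, compares with the library value at $z = -2.5$, and estimates the residues at $z = -n$ from $\epsilon\,\Gamma(-n+\epsilon)$.

```python
import math

def gamma_by_recurrence(z):
    """Gamma(z) for real z that is not zero or a negative integer, via (10.6):
    shift z to the right until it is positive, then divide out the factors."""
    denominator = 1.0
    n = 0
    while z + n <= 0:
        denominator *= z + n
        n += 1
    return math.gamma(z + n) / denominator

print(gamma_by_recurrence(-2.5), math.gamma(-2.5))

# residue at z = -n: eps * Gamma(-n + eps) should approach (-1)^n / n!
eps = 1e-6
for n in range(4):
    print(n, eps * gamma_by_recurrence(-n + eps), (-1) ** n / math.factorial(n))
```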

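We can also view the analytic continuation as an example of Taylor series
subtraction. Let us recall how this works. Suppose that $-1 < \operatorname{Re} x < 0$.
Then, from

$$\frac{d}{dt}\left(t^x e^{-t}\right) = x\,t^{x-1}e^{-t} - t^x e^{-t} \tag{10.7}$$

we have

$$\left[t^x e^{-t}\right]_{\epsilon}^{\infty} = x\int_{\epsilon}^{\infty}dt\,t^{x-1}e^{-t} - \int_{\epsilon}^{\infty}dt\,t^x e^{-t}. \tag{10.8}$$

Here we have cut off the integral at the lower limit so as to avoid the
divergence near $t = 0$. Evaluating the left-hand side and dividing by $x$ (and
replacing the factor $e^{-\epsilon}$ by 1, which costs an error of order $\epsilon^{x+1}$ that vanishes
as $\epsilon \to 0$) we find

$$-\frac{1}{x}\epsilon^x = \int_{\epsilon}^{\infty}dt\,t^{x-1}e^{-t} - \frac{1}{x}\int_{\epsilon}^{\infty}dt\,t^x e^{-t}. \tag{10.9}$$

Since, for this range of $x$,

$$-\frac{1}{x}\epsilon^x = \int_{\epsilon}^{\infty}dt\,t^{x-1}, \tag{10.10}$$

we can rewrite (10.9) as

$$\frac{1}{x}\int_{\epsilon}^{\infty}dt\,t^x e^{-t} = \int_{\epsilon}^{\infty}dt\,t^{x-1}\left(e^{-t} - 1\right). \tag{10.11}$$

The integral on the right-hand side of this last expression is convergent as
$\epsilon \to 0$, so we may safely take the limit and find

$$\frac{1}{x}\Gamma(x+1) = \int_0^{\infty}dt\,t^{x-1}\left(e^{-t} - 1\right). \tag{10.12}$$

Since the left-hand side is equal to $\Gamma(x)$, we have shown that

$$\Gamma(x) = \int_0^{\infty}dt\,t^{x-1}\left(e^{-t} - 1\right), \quad -1 < \operatorname{Re} x < 0. \tag{10.13}$$

Similarly, if $-2 < \operatorname{Re} x < -1$, we can show that

$$\Gamma(x) = \int_0^{\infty}dt\,t^{x-1}\left(e^{-t} - 1 + t\right). \tag{10.14}$$

Thus the analytic continuation of the original integral is given by a new
integral in which we have subtracted exactly as many terms from the Taylor
expansion of $e^{-t}$ as are needed to just make the integral convergent.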
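
The subtracted integral (10.13) can be checked numerically against a library Gamma function. The sketch below is illustrative only: the substitution $t = e^s$, the trapezoidal rule, and the cutoff and step values are our own choices. After the substitution the integrand $e^{sx}(e^{-e^s} - 1)$ decays exponentially at both ends when $-1 < x < 0$, so a plain trapezoidal sum is very accurate.

```python
import math

def gamma_subtracted(x, s_max=80.0, h=0.05):
    """Trapezoidal estimate of int_0^inf t^(x-1) (exp(-t) - 1) dt for -1 < x < 0,
    after the substitution t = exp(s), so that t^(x-1) dt = exp(s*x) ds."""
    n = int(round(2 * s_max / h))
    total = 0.0
    for j in range(n + 1):
        s = -s_max + j * h
        weight = 0.5 if j in (0, n) else 1.0
        # expm1(-t) computes exp(-t) - 1 without cancellation at small t
        total += weight * math.exp(s * x) * math.expm1(-math.exp(s))
    return total * h

x = -0.5
print(gamma_subtracted(x), math.gamma(x), -2.0 * math.sqrt(math.pi))
```

The three printed numbers should agree, since $\Gamma(-1/2) = -2\sqrt{\pi}$. For $x$ close to $-1$ or $0$ the tails decay more slowly and a larger cutoff would be needed.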

Other useful identities, usually proved by elementary real-variable
methods, include Euler's "Beta function" identity,

$$B(a, b) := \frac{\Gamma(a)\Gamma(b)}{\Gamma(a+b)} = \int_0^1 (1-t)^{a-1}t^{b-1}\,dt \tag{10.15}$$

(which, as the Veneziano formula, was the original inspiration for string
theory) and

$$\Gamma(z)\Gamma(1-z) = \pi\operatorname{cosec}\pi z. \tag{10.16}$$

The proofs of both formulae begin in the same way: set $t = y^2$, $x^2$, so that

$$\Gamma(a)\Gamma(b) = 4\int_0^{\infty}y^{2a-1}e^{-y^2}\,dy\int_0^{\infty}x^{2b-1}e^{-x^2}\,dx$$
$$= 4\int_0^{\infty}\!\!\int_0^{\infty}e^{-(x^2+y^2)}x^{2b-1}y^{2a-1}\,dx\,dy$$
$$= 2\int_0^{\infty}e^{-r^2}(r^2)^{a+b-1}\,d(r^2)\int_0^{\pi/2}\sin^{2a-1}\theta\cos^{2b-1}\theta\,d\theta.$$

We have appealed to Fubini's theorem twice: once to turn a product of
integrals into a double integral, and once (after setting $x = r\cos\theta$, $y = r\sin\theta$)
to turn the double integral back into a product of decoupled integrals.
In the second factor of the third line we can now change variables to $t = \sin^2\theta$
and obtain the Beta function identity. If, on the other hand, we put $a = 1 - z$,
$b = z$ we have

$$\Gamma(z)\Gamma(1-z) = 2\int_0^{\infty}e^{-r^2}\,d(r^2)\int_0^{\pi/2}\cot^{2z-1}\theta\,d\theta = 2\int_0^{\pi/2}\cot^{2z-1}\theta\,d\theta. \tag{10.17}$$

Now set $\cot\theta = \zeta$. The last integral then becomes (see exercise 9.1):

$$2\int_0^{\infty}\frac{\zeta^{2z-1}}{\zeta^2 + 1}\,d\zeta = \pi\operatorname{cosec}\pi z, \quad 0 < z < 1. \tag{10.18}$$

Although this integral has a restriction on the range of $z$, the result (10.16)
can be analytically continued so as to hold for all $z$. If we put $z = 1/2$
we find that $(\Gamma(1/2))^2 = \pi$. The positive square root is the correct one, and

$$\Gamma(1/2) = \sqrt{\pi}. \tag{10.19}$$
The integral in definition A is only convergent for $\operatorname{Re} z > 0$. A more
powerful definition, involving an integral which converges for all $z$, is

$$\frac{1}{\Gamma(z)} = \frac{1}{2\pi i}\int_C\frac{e^t}{t^z}\,dt. \quad \text{(definition B)} \tag{10.20}$$

[Figure 10.1: Definition "B" contour for $\Gamma(z)$. Axes: Re(t), Im(t); the contour is labelled C.]

Here $C$ is a contour originating at $t = -\infty - i\epsilon$, below the negative real axis
(on which a cut serves to make $t^{-z}$ single valued), rounding the origin, and
then heading back to $t = -\infty + i\epsilon$, this time staying above the cut. We
take $\arg t$ to be $+\pi$ immediately above the cut, and $-\pi$ immediately below
it. This new definition is due to Hankel.

For $z$ an integer, the cut is ineffective and we can close the contour to
find

$$\frac{1}{\Gamma(0)} = 0; \qquad \frac{1}{\Gamma(n)} = \frac{1}{(n-1)!}, \quad n > 0. \tag{10.21}$$

Thus definitions A and B agree on the integers. It is less obvious that they
agree for all $z$. A hint that this is true stems from integrating by parts

$$\frac{1}{\Gamma(z)} = \frac{1}{2\pi i}\left[-\frac{e^t}{(z-1)t^{z-1}}\right]_{-\infty-i\epsilon}^{-\infty+i\epsilon} + \frac{1}{(z-1)\,2\pi i}\int_C\frac{e^t}{t^{z-1}}\,dt = \frac{1}{(z-1)\Gamma(z-1)}. \tag{10.22}$$

The integrated out part vanishes because $e^t$ is zero at $-\infty$. Thus the "new"
gamma function obeys the same functional relation as the "old" one.

To show the equivalence in general we will examine the definition B ex- 
pression for T(l — z) 



V(l-z) 



1 

2vri 



eV" 1 dt. 



(10.23) 



c 



We will assume initially that $\mathrm{Re}\,z > 0$, so that there is no contribution from the small circle about the origin. We can therefore focus on the contribution from the discontinuity across the cut:

$$\frac{1}{\Gamma(1-z)} = \frac{1}{2\pi i}\int_C e^t\,t^{z-1}\,dt = -\frac{1}{2\pi i}\bigl(2i\sin\pi(z-1)\bigr)\int_0^\infty t^{z-1}e^{-t}\,dt = \frac{1}{\pi}\sin\pi z\int_0^\infty t^{z-1}e^{-t}\,dt. \tag{10.24}$$



The proof is then completed by using $\Gamma(z)\Gamma(1-z) = \pi\,\mathrm{cosec}\,\pi z$, which we proved using definition A, to show that, under definition A, the right hand side is indeed equal to $1/\Gamma(1-z)$. We now use the uniqueness of analytic continuation, noting that if two analytic functions agree on the region $\mathrm{Re}\,z > 0$, then they agree everywhere.
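As a quick sanity check on the reflection formula used in this argument, the following minimal Python sketch compares $\Gamma(z)\Gamma(1-z)$ with $\pi/\sin\pi z$ at a few real, non-integer points. The function name `reflection_residual` is purely illustrative; it relies only on the standard-library `math.gamma`.

```python
import math

def reflection_residual(z):
    """Return |Gamma(z) Gamma(1-z) - pi / sin(pi z)| for real, non-integer z."""
    lhs = math.gamma(z) * math.gamma(1.0 - z)
    rhs = math.pi / math.sin(math.pi * z)
    return abs(lhs - rhs)

if __name__ == "__main__":
    # Points on both sides of the strip 0 < z < 1, where the integral proof applies.
    for z in (0.25, 0.5, 0.75, -1.3, 2.6):
        print(z, reflection_residual(z))
```

The points outside $0 < z < 1$ illustrate that the identity survives analytic continuation, as claimed in the text.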



Infinite Product for $\Gamma(z)$

The function $\Gamma(z)$ has poles at $z = 0, -1, -2, \ldots$, therefore $(z\Gamma(z))^{-1} = (\Gamma(z+1))^{-1}$ has zeros at $z = -1, -2, \ldots$. Furthermore the integral in "definition B" converges for all $z$, and so $1/\Gamma(z)$ has no singularities in the finite $z$ plane, i.e. it is an entire function. This means that we can use the infinite product formula

$$g(z) = g(0)\,e^{cz}\prod_{j=1}^{\infty}\left\{\left(1-\frac{z}{z_j}\right)e^{z/z_j}\right\} \tag{10.25}$$

for entire functions, in which the $z_j$ are the zeros of $g$ and $c = g'(0)/g(0)$.

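We need to recall the definition of the Euler-Mascheroni constant $\gamma = -\Gamma'(1) = 0.5772157\ldots$, and that $\Gamma(1) = 1$. Then

$$\frac{1}{\Gamma(z)} = z\,e^{\gamma z}\prod_{n=1}^{\infty}\left\{\left(1+\frac{z}{n}\right)e^{-z/n}\right\}. \tag{10.26}$$

We can use this formula to compute

$$\frac{1}{\Gamma(z)\Gamma(1-z)} = \frac{1}{(-z)\Gamma(z)\Gamma(-z)} = z\prod_{n=1}^{\infty}\left\{\left(1+\frac{z}{n}\right)e^{-z/n}\left(1-\frac{z}{n}\right)e^{z/n}\right\} = z\prod_{n=1}^{\infty}\left(1-\frac{z^2}{n^2}\right) = \frac{1}{\pi}\sin\pi z,$$

and so obtain another demonstration that $\Gamma(z)\Gamma(1-z) = \pi\,\mathrm{cosec}\,\pi z$.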
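The infinite product for $1/\Gamma(z)$ can be checked numerically. The sketch below truncates the product after a finite number of factors (the name `inv_gamma_product` and the default truncation are illustrative choices, not part of the text) and compares with `math.gamma`. It assumes real $z > -1$, so that every factor $1 + z/n$ is positive.

```python
import math

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant

def inv_gamma_product(z, terms=200_000):
    """Truncated product z e^{gamma z} prod_{n<=terms} (1 + z/n) e^{-z/n}.

    Assumes real z > -1.  The neglected tail changes the logarithm of the
    product by roughly z**2 / (2 * terms), so convergence is slow.
    """
    log_prod = sum(math.log1p(z / n) - z / n for n in range(1, terms + 1))
    return z * math.exp(EULER_GAMMA * z + log_prod)

if __name__ == "__main__":
    for z in (0.5, 1.0, 2.5):
        print(z, inv_gamma_product(z) * math.gamma(z))  # should be close to 1
```

Summing logarithms rather than multiplying the factors directly keeps the accumulation well conditioned for large truncation orders.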
Exercise 10.1: Starting from the infinite product formula for $\Gamma(z)$, show that

$$\frac{d^2}{dz^2}\ln\Gamma(z) = \sum_{n=0}^{\infty}\frac{1}{(z+n)^2}.$$

(Compare this "half series" with the expansion

$$\pi^2\,\mathrm{cosec}^2\,\pi z = \sum_{n=-\infty}^{\infty}\frac{1}{(z+n)^2}.)$$



10.2 Linear Differential Equations 

When a linear differential equation has meromorphic coefficients, its solutions can be extended off the real line and into the complex plane. The broader horizon then allows us to see much more of their structure.



10.2.1 Monodromy 

Consider the linear differential equation 

$$Ly \equiv y'' + p(z)y' + q(z)y = 0, \tag{10.27}$$

where $p$ and $q$ are meromorphic. Recall that the point $z = a$ is a regular singular point of the equation if $p$ or $q$ is singular there, but

$$(z-a)\,p(z), \qquad (z-a)^2 q(z) \tag{10.28}$$

are both analytic at $z = a$. We know, from the explicit construction of power series solutions, that near a regular singular point $y$ is a sum of functions of the form $y = (z-a)^\alpha\varphi(z)$ or $y = (z-a)^\alpha\bigl(\ln(z-a)\,\varphi(z) + \chi(z)\bigr)$, where both $\varphi(z)$ and $\chi(z)$ are analytic near $z = a$. We now examine this fact in a more topological way.

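Suppose that $y_1$ and $y_2$ are linearly independent solutions of $Ly = 0$. Start from some ordinary (non-singular) point of the equation and analytically continue the solutions round the singularity at $z = a$ and back to the starting point. The continued functions $\tilde y_1$ and $\tilde y_2$ will not in general coincide with the original solutions, but being still solutions of the equation, must be linear combinations of them. Therefore

$$\begin{pmatrix}\tilde y_1\\ \tilde y_2\end{pmatrix} = \begin{pmatrix}a_{11} & a_{12}\\ a_{21} & a_{22}\end{pmatrix}\begin{pmatrix}y_1\\ y_2\end{pmatrix}, \tag{10.29}$$

for some constants $a_{ij}$. By a suitable redefinition of the $y_i$ we may either diagonalise this monodromy matrix to find

$$\begin{pmatrix}\tilde y_1\\ \tilde y_2\end{pmatrix} = \begin{pmatrix}\lambda_1 & 0\\ 0 & \lambda_2\end{pmatrix}\begin{pmatrix}y_1\\ y_2\end{pmatrix} \tag{10.30}$$

or, if the eigenvalues coincide and the matrix is not diagonalizable, reduce it to a Jordan form

$$\begin{pmatrix}\tilde y_1\\ \tilde y_2\end{pmatrix} = \begin{pmatrix}\lambda & 1\\ 0 & \lambda\end{pmatrix}\begin{pmatrix}y_1\\ y_2\end{pmatrix}. \tag{10.31}$$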

These equations are satisfied, in the diagonalizable case, by functions of the form

$$y_1 = (z-a)^{\alpha_1}\varphi_1(z), \qquad y_2 = (z-a)^{\alpha_2}\varphi_2(z), \tag{10.32}$$

where $\lambda_k = e^{2\pi i\alpha_k}$, and $\varphi_k(z)$ is single valued near $z = a$. In the Jordan-form case we must have

$$y_1 = (z-a)^{\alpha}\left[\varphi_1(z) + \frac{1}{2\pi i\lambda}\ln(z-a)\,\varphi_2(z)\right], \qquad y_2 = (z-a)^{\alpha}\varphi_2(z), \tag{10.33}$$

where again the $\varphi_k(z)$ are single valued. Notice that coincidence of the monodromy eigenvalues $\lambda_1$ and $\lambda_2$ does not require the exponents $\alpha_1$ and $\alpha_2$ to be the same, only that they differ by an integer. This is the same condition that signals the presence of a logarithm in the traditional series solution.

The occurrence of fractional powers and logarithms in solutions near a regular singular point is therefore quite natural.
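The monodromy eigenvalue $\lambda = e^{2\pi i\alpha}$ can be seen numerically by continuing $y = z^\alpha$ in small steps once round the origin. The sketch below is only an illustration (the function name and step count are arbitrary choices): at each step the ratio of successive points is close to 1, so the principal logarithm of that ratio is the continuous one.

```python
import cmath
import math

def continue_power_around_origin(alpha, steps=1000):
    """Continue y = z**alpha anticlockwise round |z| = 1, starting from y(1) = 1.

    Returns the value of the continued function on returning to z = 1.
    """
    y = 1.0 + 0.0j
    z_prev = 1.0 + 0.0j
    for k in range(1, steps + 1):
        z_next = cmath.exp(2j * math.pi * k / steps)
        # z_next / z_prev is near 1, so its principal log has no branch jump.
        y *= cmath.exp(alpha * cmath.log(z_next / z_prev))
        z_prev = z_next
    return y

if __name__ == "__main__":
    alpha = 0.3
    print(continue_power_around_origin(alpha))   # continued value
    print(cmath.exp(2j * math.pi * alpha))       # expected monodromy factor
```

For integer `alpha` the returned value is 1, reflecting the fact that exponents differing by an integer give the same monodromy eigenvalue.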



10.2.2 Hypergeometric Functions 

Most of the special functions of Mathematical Physics are special cases of 
the hypergeometric function F(a, b; c; z), which may be defined by the series 

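$$F(a,b;c;z) = 1 + \frac{a\cdot b}{1\cdot c}\,z + \frac{a(a+1)\,b(b+1)}{2!\,c(c+1)}\,z^2 + \frac{a(a+1)(a+2)\,b(b+1)(b+2)}{3!\,c(c+1)(c+2)}\,z^3 + \cdots = \frac{\Gamma(c)}{\Gamma(a)\Gamma(b)}\sum_{n=0}^{\infty}\frac{\Gamma(a+n)\Gamma(b+n)}{\Gamma(c+n)\Gamma(1+n)}\,z^n. \tag{10.34}$$

For general values of $a, b, c$, this series converges for $|z| < 1$, the singularity restricting the convergence being a branch point at $z = 1$.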

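Examples:

$$(1+z)^n = F(-n, b; b; -z), \tag{10.35}$$

$$\ln(1+z) = z\,F(1, 1; 2; -z), \tag{10.36}$$

$$z^{-1}\sin^{-1}z = F\!\left(\tfrac12, \tfrac12; \tfrac32; z^2\right), \tag{10.37}$$

$$e^z = \lim_{b\to\infty}F(1, b; 1; z/b), \tag{10.38}$$

$$P_n(z) = F\!\left(-n, n+1; 1; \frac{1-z}{2}\right), \tag{10.39}$$

where in the last line $P_n$ is the Legendre polynomial.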
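The examples above are easy to spot-check by summing the defining series directly. The following is a minimal sketch, assuming real arguments with $|z| < 1$ (or a terminating series); the helper name `hyp2f1_series` is an illustrative choice and not a library routine.

```python
import math

def hyp2f1_series(a, b, c, z, terms=400):
    """Partial sum of the hypergeometric series F(a, b; c; z).

    Each term is obtained from the previous one via the ratio
    (a+n)(b+n) z / ((c+n)(n+1)); valid for |z| < 1 or terminating series.
    """
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total

if __name__ == "__main__":
    z = 0.3
    print((1 + z) ** 3, hyp2f1_series(-3, 0.7, 0.7, -z))           # (10.35)
    print(math.log(1 + z), z * hyp2f1_series(1, 1, 2, -z))         # (10.36)
    print(math.asin(z) / z, hyp2f1_series(0.5, 0.5, 1.5, z * z))   # (10.37)
    print(0.5 * (3 * z * z - 1), hyp2f1_series(-2, 3, 1, (1 - z) / 2))  # P_2
```

When $a$ is a non-positive integer the term ratio eventually vanishes, so the same loop handles the polynomial cases without special treatment.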

For future reference, note that expanding the right hand side as a power series in $z$ and integrating term by term shows that

$$F(a,b;c;z) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)}\int_0^1 (1-tz)^{-a}\,t^{b-1}(1-t)^{c-b-1}\,dt. \tag{10.40}$$

If $\mathrm{Re}\,c > \mathrm{Re}\,(a+b)$, we may set $z = 1$ in this integral to get

$$F(a,b;c;1) = \frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}. \tag{10.41}$$






The hypergeometric function is a solution of the second-order differential equation

$$z(1-z)y'' + [c - (a+b+1)z]\,y' - ab\,y = 0. \tag{10.42}$$

This equation has regular singular points at $z = 0, 1, \infty$. Provided that $1 - c$ is not an integer, the general solution is

$$y = A\,F(a,b;c;z) + B\,z^{1-c}F(b-c+1, a-c+1; 2-c; z). \tag{10.43}$$

The hypergeometric equation is a particular case of the general Fuchsian equation having three¹ regular singularities at $z = z_1, z_2, z_3$. This equation is

$$y'' + P(z)y' + Q(z)y = 0, \tag{10.44}$$

where

$$P(z) = \frac{1-\alpha-\alpha'}{z-z_1} + \frac{1-\beta-\beta'}{z-z_2} + \frac{1-\gamma-\gamma'}{z-z_3},$$

$$Q(z) = \frac{1}{(z-z_1)(z-z_2)(z-z_3)}\left[\frac{(z_1-z_2)(z_1-z_3)\,\alpha\alpha'}{z-z_1} + \frac{(z_2-z_3)(z_2-z_1)\,\beta\beta'}{z-z_2} + \frac{(z_3-z_1)(z_3-z_2)\,\gamma\gamma'}{z-z_3}\right]. \tag{10.45}$$

The parameters are subject to the constraint $\alpha + \beta + \gamma + \alpha' + \beta' + \gamma' = 1$, which ensures that $z = \infty$ is not a singular point of the equation. This equation is sometimes called Riemann's $P$-equation. The $P$ probably stands for Papperitz, who discovered it.

¹ The Fuchsian equation with two regular singularities is

$$y'' + p(z)y' + q(z)y = 0$$

with

$$p(z) = \frac{1-\alpha-\alpha'}{z-z_1} + \frac{1+\alpha+\alpha'}{z-z_2}, \qquad q(z) = \frac{\alpha\alpha'(z_1-z_2)^2}{(z-z_1)^2(z-z_2)^2}.$$

Its general solution is

$$y = A\left(\frac{z-z_1}{z-z_2}\right)^{\alpha} + B\left(\frac{z-z_1}{z-z_2}\right)^{\alpha'}.$$

The indicial equation relative to the regular singular point at $z_1$ is

$$r(r-1) + (1-\alpha-\alpha')\,r + \alpha\alpha' = 0, \tag{10.46}$$

and has roots $r = \alpha, \alpha'$. From this we deduce that Riemann's equation has solutions which behave like $(z-z_1)^{\alpha}$ and $(z-z_1)^{\alpha'}$ near $z_1$. Similarly, there are solutions that behave like $(z-z_2)^{\beta}$ and $(z-z_2)^{\beta'}$ near $z_2$, and like $(z-z_3)^{\gamma}$ and $(z-z_3)^{\gamma'}$ near $z_3$. The solution space of Riemann's equation is traditionally denoted by the Riemann "P" symbol

$$y = P\begin{Bmatrix} z_1 & z_2 & z_3 & \\ \alpha & \beta & \gamma & z\\ \alpha' & \beta' & \gamma' & \end{Bmatrix} \tag{10.47}$$

where the six quantities $\alpha, \beta, \gamma, \alpha', \beta', \gamma'$ are called the exponents of the solution. A particular solution is

$$y = \left(\frac{z-z_1}{z-z_2}\right)^{\alpha}\left(\frac{z-z_3}{z-z_2}\right)^{\gamma}F\!\left(\alpha+\beta+\gamma,\ \alpha+\beta'+\gamma;\ 1+\alpha-\alpha';\ \frac{(z-z_1)(z_3-z_2)}{(z-z_2)(z_3-z_1)}\right). \tag{10.48}$$

By permuting the triples $(z_1, \alpha, \alpha')$, $(z_2, \beta, \beta')$, $(z_3, \gamma, \gamma')$, and within them interchanging the pairs $\alpha \leftrightarrow \alpha'$, $\gamma \leftrightarrow \gamma'$, we may find a total² of $6 \times 4 = 24$ solutions of this form. They are called the Kummer solutions. Only two of these can be linearly independent, and a large part of the theory of special functions is devoted to obtaining the linear relations between them.

It is straightforward, but a trifle tedious, to show that

$$(z-z_1)^r(z-z_2)^s(z-z_3)^t\,P\begin{Bmatrix} z_1 & z_2 & z_3 & \\ \alpha & \beta & \gamma & z\\ \alpha' & \beta' & \gamma' & \end{Bmatrix} = P\begin{Bmatrix} z_1 & z_2 & z_3 & \\ \alpha+r & \beta+s & \gamma+t & z\\ \alpha'+r & \beta'+s & \gamma'+t & \end{Bmatrix} \tag{10.49}$$

provided $r + s + t = 0$. Riemann's equation retains its form under Möbius maps, only the location of the singular points changing. We therefore deduce that

$$P\begin{Bmatrix} z_1 & z_2 & z_3 & \\ \alpha & \beta & \gamma & z\\ \alpha' & \beta' & \gamma' & \end{Bmatrix} = P\begin{Bmatrix} z_1' & z_2' & z_3' & \\ \alpha & \beta & \gamma & z'\\ \alpha' & \beta' & \gamma' & \end{Bmatrix} \tag{10.50}$$

where

$$z' = \frac{az+b}{cz+d}, \quad z_1' = \frac{az_1+b}{cz_1+d}, \quad z_2' = \frac{az_2+b}{cz_2+d}, \quad z_3' = \frac{az_3+b}{cz_3+d}. \tag{10.51}$$

² The interchange $\beta \leftrightarrow \beta'$ leaves the hypergeometric function invariant, and so does not give a new solution.

By using the Möbius map which takes $(z_1, z_2, z_3) \to (0, 1, \infty)$, and by extracting powers to shift the exponents, we can reduce the general eight-parameter Riemann equation to the three-parameter hypergeometric equation.

The $P$ symbol for the hypergeometric equation is

$$F(a,b;c;z) = P\begin{Bmatrix} 0 & \infty & 1 & \\ 0 & a & 0 & z\\ 1-c & b & c-a-b & \end{Bmatrix}. \tag{10.52}$$

Using this observation and a suitable Möbius map we see that

$$F(a, b; a+b-c+1; 1-z)$$

and

$$(1-z)^{c-a-b}F(c-b, c-a; c-a-b+1; 1-z)$$

are also solutions of the hypergeometric equation, each having a pure (as opposed to a linear combination of) power-law behaviour near $z = 1$. (The previous solutions had pure power-law behaviours near $z = 0$.) These new solutions must be linear combinations of the old, and we may use

$$F(a,b;c;1) = \frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}, \quad \mathrm{Re}\,(c-a-b) > 0, \tag{10.53}$$

together with the trick of substituting $z = 0$ and $z = 1$, to determine the coefficients and show that

$$F(a,b;c;z) = \frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}\,F(a, b; a+b-c+1; 1-z) + \frac{\Gamma(c)\Gamma(a+b-c)}{\Gamma(a)\Gamma(b)}\,(1-z)^{c-a-b}F(c-b, c-a; c-a-b+1; 1-z). \tag{10.54}$$

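This last equation holds for all values of $a, b, c$ such that the gamma functions make sense.

A complete set of pure-power solutions can be taken to be

$$\begin{aligned}
\varphi_0^{(0)}(z) &= F(a, b; c; z),\\
\varphi_0^{(1)}(z) &= z^{1-c}F(a+1-c, b+1-c; 2-c; z),\\
\varphi_1^{(0)}(z) &= F(a, b; 1-c+a+b; 1-z),\\
\varphi_1^{(1)}(z) &= (1-z)^{c-a-b}F(c-a, c-b; 1+c-a-b; 1-z),\\
\varphi_\infty^{(0)}(z) &= z^{-a}F(a, a+1-c; 1+a-b; z^{-1}),\\
\varphi_\infty^{(1)}(z) &= z^{-b}F(b, b+1-c; 1-a+b; z^{-1}).
\end{aligned} \tag{10.55}$$

The connection coefficients are then

$$\begin{aligned}
\varphi_0^{(0)} &= \frac{\Gamma(c)\Gamma(c-a-b)}{\Gamma(c-a)\Gamma(c-b)}\,\varphi_1^{(0)} + \frac{\Gamma(c)\Gamma(a+b-c)}{\Gamma(a)\Gamma(b)}\,\varphi_1^{(1)},\\
\varphi_0^{(1)} &= \frac{\Gamma(2-c)\Gamma(c-a-b)}{\Gamma(1-a)\Gamma(1-b)}\,\varphi_1^{(0)} + \frac{\Gamma(2-c)\Gamma(a+b-c)}{\Gamma(a+1-c)\Gamma(b+1-c)}\,\varphi_1^{(1)},
\end{aligned} \tag{10.56}$$

and

$$\begin{aligned}
\varphi_0^{(0)} &= e^{i\pi a}\,\frac{\Gamma(c)\Gamma(b-a)}{\Gamma(c-a)\Gamma(b)}\,\varphi_\infty^{(0)} + e^{i\pi b}\,\frac{\Gamma(c)\Gamma(a-b)}{\Gamma(c-b)\Gamma(a)}\,\varphi_\infty^{(1)},\\
\varphi_0^{(1)} &= e^{i\pi(a+1-c)}\,\frac{\Gamma(2-c)\Gamma(b-a)}{\Gamma(b+1-c)\Gamma(1-a)}\,\varphi_\infty^{(0)} + e^{i\pi(b+1-c)}\,\frac{\Gamma(2-c)\Gamma(a-b)}{\Gamma(a+1-c)\Gamma(1-b)}\,\varphi_\infty^{(1)}.
\end{aligned} \tag{10.57}$$

These relations assume that $\mathrm{Im}\,z > 0$. The signs in the exponential factors must be reversed when $\mathrm{Im}\,z < 0$.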
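The connection formula between the solutions at $z = 0$ and $z = 1$ (the first line of (10.56), equivalently (10.54)) can be spot-checked for real $0 < z < 1$, where the series on both sides converge. The sketch below is illustrative only: `hyp2f1_series` and `connection_rhs` are names invented here, and the parameter values are arbitrary choices that avoid the poles of the gamma function.

```python
import math

def hyp2f1_series(a, b, c, z, terms=300):
    """Partial sum of the hypergeometric series, valid for |z| < 1."""
    total, term = 0.0, 1.0
    for n in range(terms):
        total += term
        term *= (a + n) * (b + n) / ((c + n) * (n + 1)) * z
    return total

def connection_rhs(a, b, c, z):
    """Right-hand side of the z = 0 to z = 1 connection formula."""
    g = math.gamma
    first = (g(c) * g(c - a - b) / (g(c - a) * g(c - b))
             * hyp2f1_series(a, b, a + b - c + 1, 1 - z))
    second = (g(c) * g(a + b - c) / (g(a) * g(b))
              * (1 - z) ** (c - a - b)
              * hyp2f1_series(c - a, c - b, c - a - b + 1, 1 - z))
    return first + second

if __name__ == "__main__":
    a, b, c, z = 0.3, 0.4, 1.25, 0.5
    print(hyp2f1_series(a, b, c, z), connection_rhs(a, b, c, z))
```

Choosing $z = 1/2$ makes both $z$ and $1 - z$ lie well inside the unit disc, so a few hundred terms are more than enough.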

Example: The Pöschl-Teller problem for general positive $l$. A substitution $z = (1 + e^{2x})^{-1}$ shows that the Pöschl-Teller Schrödinger equation

$$\left(-\frac{d^2}{dx^2} - l(l+1)\,\mathrm{sech}^2 x\right)\psi = E\psi \tag{10.58}$$

has solution

$$\psi(x) = (1+e^{2x})^{-\kappa/2}(1+e^{-2x})^{-\kappa/2}\,F\!\left(\kappa+l+1, \kappa-l; \kappa+1; \frac{1}{1+e^{2x}}\right), \tag{10.59}$$

where $E = -\kappa^2$. This solution behaves near $x = \infty$ as

$$\psi \sim e^{-\kappa x}F(\kappa+l+1, \kappa-l; \kappa+1; 0) = e^{-\kappa x}. \tag{10.60}$$

We use the connection formula (10.54) to see that it behaves in the vicinity of $x = -\infty$ as

$$\psi \sim e^{\kappa x}F(\kappa+l+1, \kappa-l; \kappa+1; 1-e^{2x}) \to e^{\kappa x}\,\frac{\Gamma(\kappa+1)\Gamma(-\kappa)}{\Gamma(-l)\Gamma(1+l)} + e^{-\kappa x}\,\frac{\Gamma(\kappa+1)\Gamma(\kappa)}{\Gamma(\kappa+l+1)\Gamma(\kappa-l)}. \tag{10.61}$$

To find the bound-state spectrum, assume that $\kappa$ is positive. Then $E = -\kappa^2$ will be an eigenvalue provided that the coefficient of $e^{-\kappa x}$ near $x = -\infty$ vanishes. In other words, when

$$\frac{\Gamma(\kappa+1)\Gamma(\kappa)}{\Gamma(\kappa+l+1)\Gamma(\kappa-l)} = 0. \tag{10.62}$$

This condition is satisfied for a finite set $\kappa_n$, $n = 1, \ldots, [l]$ (where $[l]$ denotes the integer part of $l$) at which $\kappa$ is positive but $\kappa - l$ is zero or a negative integer.

On setting $\kappa = -ik$, we find the scattering solution

$$\psi(x) \sim \begin{cases} e^{ikx} + r(k)\,e^{-ikx}, & x \ll 0,\\ t(k)\,e^{ikx}, & x \gg 0, \end{cases} \tag{10.63}$$

where

$$r(k) = \frac{\Gamma(l+1-ik)\,\Gamma(-ik-l)\,\Gamma(ik)}{\Gamma(-l)\,\Gamma(1+l)\,\Gamma(-ik)} = -\frac{\sin\pi l}{\pi}\,\frac{\Gamma(l+1-ik)\,\Gamma(-ik-l)\,\Gamma(ik)}{\Gamma(-ik)}, \tag{10.64}$$

and

$$t(k) = \frac{\Gamma(l+1-ik)\,\Gamma(-ik-l)}{\Gamma(1-ik)\,\Gamma(-ik)}. \tag{10.65}$$

Whenever $l$ is a (positive) integer, the divergent factor of $\Gamma(-l)$ in the denominator of $r(k)$ causes the reflected wave to vanish. This is something we had discovered in earlier chapters. In this particular case the transmission coefficient $t(k)$ reduces to a phase

$$t(k) = \frac{(-ik+1)(-ik+2)\cdots(-ik+l)}{(-ik-1)(-ik-2)\cdots(-ik-l)}. \tag{10.66}$$
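That the product form of $t(k)$ for integer $l$ is a pure phase is easy to confirm numerically: each factor is a ratio of two complex numbers of equal modulus. A minimal sketch (the function name `transmission` is an illustrative choice):

```python
import cmath

def transmission(k, l):
    """Transmission amplitude for integer l >= 1 from the product form."""
    t = 1.0 + 0.0j
    for m in range(1, l + 1):
        # |-ik + m| = |-ik - m| = sqrt(k**2 + m**2), so each factor has modulus 1.
        t *= (-1j * k + m) / (-1j * k - m)
    return t

if __name__ == "__main__":
    for k in (0.1, 0.7, 5.0):
        t = transmission(k, 3)
        print(k, abs(t), cmath.phase(t))
```

At large $k$ the amplitude tends to 1, as expected for a particle whose kinetic energy is much larger than the depth of the potential well.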
10.3 Solving ODE's via Contour Integrals

Our task in this section is to understand the origin of contour integral solu- 
tions such as the expression 

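$$F(a,b;c;z) = \frac{\Gamma(c)}{\Gamma(b)\Gamma(c-b)}\int_0^1 (1-tz)^{-a}\,t^{b-1}(1-t)^{c-b-1}\,dt, \tag{10.67}$$

we have previously seen for the hypergeometric equation.

We are given a differential operator

$$L_z = \partial^2_{zz} + p(z)\,\partial_z + q(z) \tag{10.68}$$

and seek a solution of $L_z u = 0$ as an integral

$$u(z) = \int_\Gamma F(z,t)\,dt. \tag{10.69}$$

If we can find an $F$ such that

$$L_z F = \frac{\partial Q}{\partial t} \tag{10.70}$$

for some function $Q(z,t)$ then

$$L_z u = \int_\Gamma L_z F(z,t)\,dt = \int_\Gamma\left(\frac{\partial Q}{\partial t}\right)dt = [Q]_\Gamma. \tag{10.71}$$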

Thus, if Q vanishes at both ends of the contour, if it takes the same value at 
the two ends, or if the contour is closed and has no ends, we have succeeded 
in our quest. 

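Example: Consider Legendre's equation

$$L_z u \equiv (1-z^2)\frac{d^2u}{dz^2} - 2z\frac{du}{dz} + \nu(\nu+1)\,u = 0. \tag{10.72}$$

The identity

$$L_z\left\{\frac{(t^2-1)^{\nu}}{(t-z)^{\nu+1}}\right\} = (\nu+1)\frac{d}{dt}\left\{\frac{(t^2-1)^{\nu+1}}{(t-z)^{\nu+2}}\right\} \tag{10.73}$$

shows that

$$P_\nu(z) = \frac{1}{2\pi i}\int_\Gamma\frac{(t^2-1)^{\nu}}{2^{\nu}(t-z)^{\nu+1}}\,dt \tag{10.74}$$

will be a solution of Legendre's equation provided that

$$[Q]_\Gamma \equiv \left[\frac{(t^2-1)^{\nu+1}}{(t-z)^{\nu+2}}\right]_\Gamma = 0. \tag{10.75}$$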



We could, for example, take a contour that circles the points $t = z$ and $t = 1$, but excludes the point $t = -1$. On going round this contour, the numerator acquires a phase of $e^{2\pi i(\nu+1)}$, while the denominator of $[Q]_\Gamma$ acquires a phase of $e^{2\pi i(\nu+2)}$. The net phase change is therefore $e^{-2\pi i} = 1$. The function in the integrated-out part is therefore single-valued, and so the integrated-out part vanishes. When $\nu$ is an integer, Cauchy's formula shows that

$$P_n(z) = \frac{1}{2^n n!}\frac{d^n}{dz^n}(z^2-1)^n, \tag{10.76}$$

which is Rodrigues' formula for the Legendre polynomials.
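Rodrigues' formula can be evaluated exactly with rational arithmetic. The sketch below represents polynomials as coefficient lists in ascending powers of $z$; the helper names are illustrative, and only the standard library is used.

```python
from fractions import Fraction
from math import factorial

def poly_mul(p, q):
    """Multiply two polynomials given as ascending coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += pi * qj
    return out

def poly_deriv(p):
    """Differentiate a polynomial given as an ascending coefficient list."""
    return [k * p[k] for k in range(1, len(p))]

def legendre_rodrigues(n):
    """Coefficients of P_n(z) from (1 / (2^n n!)) d^n/dz^n (z^2 - 1)^n."""
    p = [Fraction(1)]
    for _ in range(n):
        p = poly_mul(p, [Fraction(-1), Fraction(0), Fraction(1)])  # times (z^2 - 1)
    for _ in range(n):
        p = poly_deriv(p)
    scale = Fraction(1, 2 ** n * factorial(n))
    return [scale * c for c in p]

if __name__ == "__main__":
    for n in range(4):
        print(n, legendre_rodrigues(n))
```

For $n = 2$ this returns the coefficients of $(3z^2 - 1)/2$, and for $n = 3$ those of $(5z^3 - 3z)/2$.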



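Figure 10.2: Figure-of-eight contour for $Q_\nu(z)$. (The contour lies in the complex $t$ plane and winds around $t = -1$ and $t = +1$.)

The figure-of-eight contour shown in figure 10.2 gives us another solution

$$Q_\nu(z) = \frac{1}{4i\sin\pi\nu}\int_\Gamma\frac{(t^2-1)^{\nu}}{2^{\nu}(z-t)^{\nu+1}}\,dt, \quad \nu \notin \mathbb{Z}. \tag{10.77}$$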



Here we define $\arg(t-1)$ and $\arg(t+1)$ to be zero for $t > 1$. The integrated out part vanishes because the phase gained by the $(t^2-1)^{\nu+1}$ in the numerator of $[Q]_\Gamma$ during the clockwise winding about $t = 1$ is undone during the anti-clockwise winding about $t = -1$, and, provided that $z$ is outside the contour, there is no phase change in the $(z-t)^{-(\nu+2)}$ in the denominator.

When $\nu$ is real and positive the contributions from the circular arcs surrounding $t = \pm 1$ become negligible as we shrink this new contour down onto the real axis. After this manoeuvre the integral (10.77) becomes

$$Q_\nu(z) = \frac{1}{2}\int_{-1}^{1}\frac{(1-t^2)^{\nu}}{2^{\nu}(z-t)^{\nu+1}}\,dt, \quad \nu > 0. \tag{10.78}$$

In contrast to (10.77), this last formula continues to make sense when $\nu$ is a positive integer, and so provides a convenient definition of $Q_n(z)$, the Legendre function of the second kind (see exercise 9.3).

It is hard to find a suitable $F(z,t)$ in one fell swoop. (The identity (10.73) exploited in the example is not exactly obvious!) An easier strategy is to seek a solution in the form of an integral operator with kernel $K$ acting on a function $v(t)$. Thus we set

$$u(z) = \int_a^b K(z,t)\,v(t)\,dt. \tag{10.79}$$

Suppose that $L_z K(z,t) = M_t K(z,t)$, where $M_t$ is a differential operator in $t$ that does not involve $z$. The operator $M_t$ will have a formal adjoint $M_t^\dagger$ such that

$$\int_a^b v\,(M_t K)\,dt - \int_a^b K\,(M_t^\dagger v)\,dt = [Q(K,v)]_a^b. \tag{10.80}$$

(This is Lagrange's identity.) Now

$$L_z u = \int_a^b L_z K(z,t)\,v\,dt = \int_a^b (M_t K(z,t))\,v\,dt = \int_a^b K(z,t)\,(M_t^\dagger v)\,dt + [Q(K,v)]_a^b.$$

We can therefore solve the original equation, $L_z u = 0$, by finding a $v$ such that $M_t^\dagger v = 0$, and a contour with endpoints such that $[Q(K,v)]_a^b = 0$. This may sound complicated, but an artful choice of $K$ can make it much simpler than solving the original problem.
Example: We will solve

$$L_z u = \frac{d^2u}{dz^2} - z\frac{du}{dz} + \nu u = 0, \tag{10.81}$$

by using the kernel $K(z,t) = e^{-zt}$. We have $L_z K(z,t) = M_t K(z,t)$ where

$$M_t = t^2 - t\frac{\partial}{\partial t} + \nu, \tag{10.82}$$

so

$$M_t^\dagger = t^2 + \frac{\partial}{\partial t}\,t + \nu = t^2 + (\nu+1) + t\frac{\partial}{\partial t}. \tag{10.83}$$

The equation $M_t^\dagger v = 0$ has solution

$$v(t) = t^{-(\nu+1)}e^{-\frac12 t^2}, \tag{10.84}$$

and so

$$u = \int_\Gamma t^{-(1+\nu)}e^{-(zt+\frac12 t^2)}\,dt, \tag{10.85}$$

for some suitable $\Gamma$.
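The claim that $v(t) = t^{-(\nu+1)}e^{-t^2/2}$ is annihilated by the adjoint operator $t^2 + (\nu+1) + t\,\partial_t$ can be checked with a finite-difference derivative. The sketch below is an illustration for real $t > 0$; the step size and function names are arbitrary choices.

```python
import math

def v(t, nu):
    """Candidate solution v(t) = t^{-(nu+1)} exp(-t^2 / 2), for real t > 0."""
    return t ** (-(nu + 1)) * math.exp(-0.5 * t * t)

def adjoint_residual(t, nu, h=1e-5):
    """Evaluate (t^2 + nu + 1) v + t v' using a central difference for v'."""
    dv = (v(t + h, nu) - v(t - h, nu)) / (2 * h)
    return (t * t + nu + 1) * v(t, nu) + t * dv

if __name__ == "__main__":
    for t in (0.7, 1.3, 2.0):
        print(t, adjoint_residual(t, nu=1.5))  # should be close to zero
```

The residual is limited only by the truncation error of the central difference, which is of order $h^2$.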



10.3.1 Bessel Functions 

As an illustration of the general method we will explore the theory of Bessel functions. Bessel functions are members of the family of confluent hypergeometric functions, obtained by letting the two regular singular points $z_2$, $z_3$ of the Riemann-Papperitz equation coalesce at infinity. The resulting singular point is no longer regular, and confluent hypergeometric functions have an essential singularity at infinity. The confluent hypergeometric equation is

$$zy'' + (c-z)y' - ay = 0, \tag{10.86}$$

with solution

$$\Phi(a,c;z) = \frac{\Gamma(c)}{\Gamma(a)}\sum_{n=0}^{\infty}\frac{\Gamma(a+n)}{\Gamma(c+n)\Gamma(n+1)}\,z^n. \tag{10.87}$$

The second solution, when $c$ is not an integer, is

$$z^{1-c}\,\Phi(a-c+1, 2-c; z). \tag{10.88}$$

We see that

$$\Phi(a,c;z) = \lim_{b\to\infty}F(a,b;c;z/b). \tag{10.89}$$

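Other functions of this family are the parabolic cylinder functions, which in special cases reduce to $e^{-z^2/4}$ times the Hermite polynomials, the error function

$$\mathrm{erf}(z) = \int_0^z e^{-t^2}\,dt = z\,\Phi\!\left(\tfrac12, \tfrac32; -z^2\right) \tag{10.90}$$

and the Laguerre polynomials

$$L_n^m(z) = \frac{\Gamma(n+m+1)}{\Gamma(n+1)\Gamma(m+1)}\,\Phi(-n, m+1; z). \tag{10.91}$$

Bessel's equation involves

$$L_z = \partial^2_{zz} + \frac{1}{z}\partial_z + \left(1 - \frac{\nu^2}{z^2}\right). \tag{10.92}$$

Experience shows that a useful kernel is

$$K(z,t) = \left(\frac{z}{2}\right)^{\nu}\exp\left(t - \frac{z^2}{4t}\right). \tag{10.93}$$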



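Then

$$L_z K(z,t) = \left(\partial_t - \frac{\nu+1}{t}\right)K(z,t) \tag{10.94}$$

so $M$ is a first order operator, which is simpler to deal with than the original second order $L_z$. In this case

$$M^\dagger = \left(-\partial_t - \frac{\nu+1}{t}\right) \tag{10.95}$$

and we need a $v$ such that

$$M^\dagger v = -\left(\partial_t + \frac{\nu+1}{t}\right)v = 0. \tag{10.96}$$

Clearly $v = t^{-\nu-1}$ will work. The integrated out part is

$$[Q(K,v)]_a^b = \left[\left(\frac{z}{2}\right)^{\nu}t^{-\nu-1}\exp\left(t - \frac{z^2}{4t}\right)\right]_a^b. \tag{10.97}$$

We see that

$$J_\nu(z) = \frac{1}{2\pi i}\left(\frac{z}{2}\right)^{\nu}\int_C t^{-\nu-1}\exp\left(t - \frac{z^2}{4t}\right)dt \tag{10.98}$$

solves Bessel's equation provided we use a suitable contour.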

We can take for $C$ a contour starting at $-\infty - i\epsilon$ and ending at $-\infty + i\epsilon$, and surrounding the branch cut of $t^{-\nu-1}$, which we take as the negative $t$ axis.

Figure 10.3: Contour for solving Bessel equation.

This contour works because $Q$ is zero at both ends of the contour.
A cosmetic rewrite $t = uz/2$ gives

$$J_\nu(z) = \frac{1}{2\pi i}\int_C u^{-\nu-1}e^{\frac{z}{2}\left(u-\frac{1}{u}\right)}\,du. \tag{10.99}$$

For $\nu$ an integer, there is no discontinuity across the cut, so we can ignore it and take $C$ to be the unit circle. Then, recognizing the resulting

$$J_n(z) = \frac{1}{2\pi i}\oint_{|u|=1}u^{-n-1}e^{\frac{z}{2}\left(u-\frac{1}{u}\right)}\,du \tag{10.100}$$

to be a Laurent coefficient, we obtain the familiar generating function

$$e^{\frac{z}{2}\left(u-\frac{1}{u}\right)} = \sum_{n=-\infty}^{\infty}J_n(z)\,u^n. \tag{10.101}$$

When $\nu$ is not an integer, we see why we need a branch cut integral.

If we set $u = e^w$ we get

$$J_\nu(z) = \frac{1}{2\pi i}\int_{C'}dw\,e^{z\sinh w - \nu w}, \tag{10.102}$$

where $C'$ goes from $\infty - i\pi$ to $-i\pi$ to $+i\pi$ to $\infty + i\pi$.

Figure 10.4: Bessel contour after change of variables.



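If we set $w = t \pm i\pi$ on the horizontals and $w = i\theta$ on the vertical part, we can rewrite this as

$$J_\nu(z) = \frac{1}{\pi}\int_0^{\pi}\cos(\nu\theta - z\sin\theta)\,d\theta - \frac{\sin\nu\pi}{\pi}\int_0^{\infty}e^{-\nu t - z\sinh t}\,dt. \tag{10.103}$$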



All these are standard formulae for the Bessel function whose origin would 
be hard to understand without the contour solutions trick. 
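For integer order the $\sin\nu\pi$ term drops out and $J_n(z)$ reduces to Bessel's integral, which can be compared with the familiar power series $J_n(z) = \sum_m (-1)^m (z/2)^{2m+n}/(m!\,(m+n)!)$. The sketch below is illustrative (the function names and the number of quadrature points are arbitrary); it uses the trapezoidal rule, which is very accurate here because the integrand extends to a smooth periodic function.

```python
import math

def bessel_series(n, z, terms=60):
    """Power series for J_n(z) with integer n >= 0."""
    return sum((-1) ** m / (math.factorial(m) * math.factorial(m + n))
               * (z / 2) ** (2 * m + n) for m in range(terms))

def bessel_integral(n, z, steps=2000):
    """(1/pi) * integral_0^pi cos(n*theta - z*sin(theta)) d(theta), trapezoidal rule."""
    def f(theta):
        return math.cos(n * theta - z * math.sin(theta))
    h = math.pi / steps
    total = 0.5 * (f(0.0) + f(math.pi))
    for j in range(1, steps):
        total += f(j * h)
    return total * h / math.pi

if __name__ == "__main__":
    for n, z in ((0, 1.0), (1, 2.0), (2, 3.5)):
        print(n, z, bessel_series(n, z), bessel_integral(n, z))
```

For non-integer order the second integral in the formula would have to be added; it is omitted in this sketch.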

When $\nu$ becomes an integer, the functions $J_\nu(z)$ and $J_{-\nu}(z)$ are no longer independent. In order to have a Bessel equation solution that retains its independence from $J_\nu(z)$, even as $\nu$ becomes a whole number, we define the Neumann function

$$\begin{aligned}
N_\nu(z) &\stackrel{\mathrm{def}}{=} \frac{J_\nu(z)\cos\nu\pi - J_{-\nu}(z)}{\sin\nu\pi}\\
&= \frac{\cot\nu\pi}{\pi}\int_0^{\pi}\cos(\nu\theta - z\sin\theta)\,d\theta - \frac{\mathrm{cosec}\,\nu\pi}{\pi}\int_0^{\pi}\cos(\nu\theta + z\sin\theta)\,d\theta\\
&\quad - \frac{\cos\nu\pi}{\pi}\int_0^{\infty}e^{-\nu t - z\sinh t}\,dt - \frac{1}{\pi}\int_0^{\infty}e^{\nu t - z\sinh t}\,dt.
\end{aligned} \tag{10.104}$$

Figure 10.5: Contours defining $H_\nu^{(1)}(z)$ and $H_\nu^{(2)}(z)$.



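Both Bessel and Neumann functions are real for positive real $x$. As $x$ becomes large they oscillate as slowly decaying sines and cosines. It is sometimes convenient to decompose these real functions into solutions that behave as $e^{\pm ix}$. We therefore define the Hankel functions by

$$\begin{aligned}
H_\nu^{(1)}(z) &= \frac{1}{i\pi}\int_{-\infty}^{\infty+i\pi}e^{z\sinh w - \nu w}\,dw, \quad |\arg z| < \pi/2,\\
H_\nu^{(2)}(z) &= -\frac{1}{i\pi}\int_{-\infty}^{\infty-i\pi}e^{z\sinh w - \nu w}\,dw, \quad |\arg z| < \pi/2.
\end{aligned} \tag{10.105}$$

Then

$$\frac12\left(H_\nu^{(1)}(z) + H_\nu^{(2)}(z)\right) = J_\nu(z), \qquad \frac{1}{2i}\left(H_\nu^{(1)}(z) - H_\nu^{(2)}(z)\right) = N_\nu(z). \tag{10.106}$$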



10.4 Asymptotic Expansions 

We often need to understand the behaviour of solutions of differential equations and functions, such as $J_\nu(x)$, when $x$ takes values that are very large, or very small. This is the subject of asymptotics.

As an introduction to this art, consider the function

$$Z(\lambda) = \int_{-\infty}^{\infty}e^{-x^2-\lambda x^4}\,dx. \tag{10.107}$$



Those of you who have taken a course in quantum field theory based on path integrals will recognize that this is a "toy," 0-dimensional, version of the path integral for the $\lambda\varphi^4$ model of a self-interacting scalar field. Suppose we wish to obtain the perturbation expansion for $Z(\lambda)$ as a power series in $\lambda$. We naturally proceed as follows:

$$\begin{aligned}
Z(\lambda) &= \int_{-\infty}^{\infty}e^{-x^2-\lambda x^4}\,dx\\
&= \int_{-\infty}^{\infty}e^{-x^2}\sum_{n=0}^{\infty}(-1)^n\frac{\lambda^n x^{4n}}{n!}\,dx\\
&= \sum_{n=0}^{\infty}(-1)^n\frac{\lambda^n}{n!}\int_{-\infty}^{\infty}e^{-x^2}x^{4n}\,dx\\
&= \sum_{n=0}^{\infty}(-1)^n\frac{\lambda^n}{n!}\,\Gamma(2n+1/2).
\end{aligned} \tag{10.108}$$

Something has clearly gone wrong here! The gamma function $\Gamma(2n+1/2) \sim (2n)! \sim 4^n(n!)^2$ overwhelms the $n!$ in the denominator and the radius of convergence of the final power series is zero.

The invalid, but popular, manoeuvre is the interchange of the order of 
performing the integral and the sum. This interchange cannot be justified 
because the sum inside the integral does not converge uniformly on the do- 
main of integration. Does this mean that the series is useless? It had better 
not! All quantum field theory (and most quantum mechanics) perturbation 
theory relies on versions of this manoeuvre. 

We are saved to some (often adequate) degree because, while the inter- 
change of integral and sum does not lead to a convergent series, it does lead 
to a valid asymptotic expansion. We write 

$$Z(\lambda) \sim \sum_{n=0}^{\infty}(-1)^n\frac{\lambda^n}{n!}\,\Gamma(2n+1/2) \tag{10.109}$$

where

$$Z(\lambda) \sim \sum_{n=0}^{\infty}a_n\lambda^n \tag{10.110}$$

is shorthand for the more explicit

$$Z(\lambda) = \sum_{n=0}^{N}a_n\lambda^n + O(\lambda^{N+1}), \quad N = 1, 2, 3, \ldots. \tag{10.111}$$

The "big O" notation

$$Z(\lambda) - \sum_{n=0}^{N}a_n\lambda^n = O(\lambda^{N+1}) \tag{10.112}$$

as $\lambda \to 0$, means that

$$\lim_{\lambda\to 0}\left\{\frac{\left|Z(\lambda) - \sum_{n=0}^{N}a_n\lambda^n\right|}{|\lambda^{N+1}|}\right\} = K < \infty. \tag{10.113}$$


The basic idea is that, given a convergent power series $\sum_n a_n\lambda^n$ for the function $f(\lambda)$, we fix the value of $\lambda$ and take more and more terms. The sum then gets closer to $f(\lambda)$. Given an asymptotic expansion, on the other hand, we select a fixed number of terms in the series and then make $\lambda$ smaller and smaller. The graph of $f(\lambda)$ and the graph of our polynomial approximation then approach each other. The more terms we take the sooner they get close, but for any non-zero $\lambda$ we can never get exactly $f(\lambda)$, no matter how many terms we take.
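The distinction can be made concrete with the toy integral $Z(\lambda)$. The sketch below evaluates $Z(\lambda)$ by straightforward quadrature and compares it with partial sums of the series with coefficients $(-1)^n\Gamma(2n+1/2)/n!$. The function names, integration window and step count are illustrative assumptions; the window is chosen so that the neglected tails are far below double precision.

```python
import math

def z_exact(lam, half_width=10.0, steps=20000):
    """Trapezoidal estimate of the integral of exp(-x^2 - lam x^4) over the real line."""
    h = 2 * half_width / steps
    total = 0.0
    for j in range(steps + 1):
        x = -half_width + j * h
        weight = 0.5 if j in (0, steps) else 1.0
        total += weight * math.exp(-x * x - lam * x ** 4)
    return total * h

def series_term(lam, n):
    """n-th term (-1)^n lam^n Gamma(2n + 1/2) / n! of the asymptotic series."""
    return (-1) ** n * lam ** n / math.factorial(n) * math.gamma(2 * n + 0.5)

def z_partial_sum(lam, order):
    """Sum of the asymptotic series up to and including the given order."""
    return sum(series_term(lam, n) for n in range(order + 1))

if __name__ == "__main__":
    for lam in (0.1, 0.01):
        err = abs(z_exact(lam) - z_partial_sum(lam, 2))
        print("lambda =", lam, "error with three terms =", err)
    # At fixed lambda the terms eventually grow without bound.
    print([abs(series_term(0.1, n)) for n in (2, 10, 20)])
```

With a fixed number of terms the error shrinks rapidly as $\lambda \to 0$, while at fixed $\lambda$ the terms eventually grow, so adding more of them makes the approximation worse.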

We often consider asymptotic expansions where the independent variable becomes large. Here we have expansions in inverse powers of $x$:

$$F(x) = \sum_{n=0}^{N}b_n x^{-n} + O(x^{-N-1}), \quad N = 1, 2, 3, \ldots. \tag{10.114}$$

In this case

$$F(x) - \sum_{n=0}^{N}b_n x^{-n} = O(x^{-N-1}) \tag{10.115}$$

means that

$$\lim_{x\to\infty}\left\{\frac{\left|F(x) - \sum_{n=0}^{N}b_n x^{-n}\right|}{|x^{-N-1}|}\right\} = K < \infty. \tag{10.116}$$

Again we take a fixed number of terms, and as x becomes large the function 
and its approximation get closer. 
Observations:

i) Knowledge of the asymptotic expansion gives us useful knowledge about the function, but does not give us everything. In particular, two distinct functions may have the same asymptotic expansion. For example, for small positive $\lambda$, the functions $F(\lambda)$ and $F(\lambda) + ae^{-b/\lambda}$ have exactly the same asymptotic expansions as series in positive powers of $\lambda$. This is because $e^{-b/\lambda}$ goes to zero faster than any power of $\lambda$, and so its asymptotic expansion $\sum_n a_n\lambda^n$ has every coefficient $a_n$ being zero. Physicists commonly say that $e^{-b/\lambda}$ is a non-perturbative function, meaning that it will not be visible to a perturbation expansion in powers of $\lambda$.

ii) An asymptotic expansion is usually valid only in a sector $a < \arg z < b$. Different sectors have different expansions. This is called the Stokes' phenomenon.

The most useful methods for obtaining asymptotic expansions require 
that the function to be expanded be given in terms of an integral. This 
is the reason why we have stressed the contour integral method of solving 
differential equations. If the integral can be approximated by a Gaussian, we are led to the method of steepest descents. This technique is best explained by means of examples.

10.4.1 Stirling's Approximation for $n!$

We start from the integral representation of the Gamma function

$$\Gamma(z+1) = \int_0^{\infty}e^{-t}t^z\,dt. \tag{10.117}$$

Set $t = z\zeta$, so

$$\Gamma(z+1) = z^{z+1}\int_0^{\infty}e^{zf(\zeta)}\,d\zeta, \tag{10.118}$$

where

$$f(\zeta) = \ln\zeta - \zeta. \tag{10.119}$$

We are going to be interested in evaluating this integral in the limit that $|z| \to \infty$ and finding the first term in the asymptotic expansion of $\Gamma(z+1)$ in powers of $1/z$. In this limit, the exponential will be dominated by the part of the integration region near the absolute maximum of $f(\zeta)$. Now $f(\zeta)$ is a maximum at $\zeta = 1$ and

$$f(\zeta) = -1 - \frac12(\zeta-1)^2 + \cdots. \tag{10.120}$$

So

$$\begin{aligned}
\Gamma(z+1) &= z^{z+1}e^{-z}\int_0^{\infty}e^{-\frac{z}{2}(\zeta-1)^2+\cdots}\,d\zeta\\
&\approx z^{z+1}e^{-z}\int_{-\infty}^{\infty}e^{-\frac{z}{2}(\zeta-1)^2}\,d\zeta\\
&= z^{z+1}e^{-z}\sqrt{\frac{2\pi}{z}}\\
&= \sqrt{2\pi}\,z^{z+1/2}e^{-z}.
\end{aligned} \tag{10.121}$$

By keeping more of the terms represented by the dots, and expanding them as

$$e^{-\frac{z}{2}(\zeta-1)^2+\cdots} = e^{-\frac{z}{2}(\zeta-1)^2}\left[1 + a_1(\zeta-1) + a_2(\zeta-1)^2 + \cdots\right], \tag{10.122}$$

we would find, on doing the integral, that

$$\Gamma(z+1) \approx \sqrt{2\pi}\,z^{z+1/2}e^{-z}\left[1 + \frac{1}{12z} + \frac{1}{288z^2} - \frac{139}{51840z^3} - \frac{571}{2488320z^4} + O\!\left(\frac{1}{z^5}\right)\right]. \tag{10.123}$$

Since $\Gamma(n+1) = n!$ we also have

$$n! \approx \sqrt{2\pi}\,n^{n+1/2}e^{-n}\left[1 + \frac{1}{12n} + \cdots\right]. \tag{10.124}$$

We make contact with our discussion of asymptotic series by rewriting the expansion as

$$\frac{\Gamma(z+1)}{\sqrt{2\pi}\,z^{z+1/2}e^{-z}} \sim 1 + \frac{1}{12z} + \frac{1}{288z^2} - \frac{139}{51840z^3} - \frac{571}{2488320z^4} + \cdots. \tag{10.125}$$

This is typical. We usually have to pull out a leading factor from the function whose asymptotic behaviour we are studying, before we are left with a plain asymptotic power series.
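The quality of Stirling's series is easy to gauge numerically. The sketch below (function and constant names are illustrative) evaluates the leading factor times the truncated correction series for real $z > 0$ and compares with `math.gamma`.

```python
import math

# Coefficients of 1, 1/z, 1/z^2, 1/z^3, 1/z^4 in the correction series.
STIRLING_COEFFS = (1.0, 1 / 12, 1 / 288, -139 / 51840, -571 / 2488320)

def stirling_gamma(z, order=4):
    """Approximate Gamma(z + 1) for real z > 0, keeping terms up to z**(-order)."""
    lead = math.sqrt(2 * math.pi) * z ** (z + 0.5) * math.exp(-z)
    correction = sum(STIRLING_COEFFS[k] / z ** k for k in range(order + 1))
    return lead * correction

if __name__ == "__main__":
    z = 10.0
    exact = math.gamma(z + 1)  # 10! = 3628800
    for order in range(5):
        rel_err = abs(stirling_gamma(z, order) - exact) / exact
        print(order, rel_err)
```

At $z = 10$ the leading term alone is accurate to a little under one percent, and each additional correction term reduces the error by roughly a further power of $1/z$.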

10.4.2 Airy Functions 

The Airy functions $\mathrm{Ai}(x)$ and $\mathrm{Bi}(x)$ are closely related to Bessel functions, and are named after the mathematician and astronomer George Biddell Airy. They occur widely in physics. We will investigate the behaviour of $\mathrm{Ai}(x)$ for large values of $|x|$. A more sophisticated treatment is needed for this problem, and we will meet with Stokes' phenomenon. Airy's differential equation is

$$\frac{d^2y}{dz^2} - zy = 0. \tag{10.126}$$

On the real axis Airy's equation becomes

$$-\frac{d^2y}{dx^2} + xy = 0, \tag{10.127}$$



and we can think of this as the Schrödinger equation for a particle running up a linear potential. A classical particle incident from the left with total energy $E = 0$ will come to rest at $x = 0$, and then retrace its path. The point $x = 0$ is therefore called a classical turning point. The corresponding quantum wavefunction, $\mathrm{Ai}(x)$, contains a travelling wave incident from the left and becoming evanescent as it tunnels into the classically forbidden region, $x > 0$, together with a reflected wave returning to $-\infty$. The sum of the incident and reflected waves is a real-valued standing wave.

Figure 10.6: The Airy function, $\mathrm{Ai}(x)$.

We will look for contour integral solutions to Airy's equation of the form

$$y(x) = \int_C e^{xt}f(t)\,dt. \tag{10.128}$$

Denoting the Airy differential operator by $L_x \equiv \partial_x^2 - x$, we have

$$\begin{aligned}
L_x y &= \int_C (t^2 - x)\,e^{xt}f(t)\,dt = \int_C f(t)\left\{t^2 - \frac{d}{dt}\right\}e^{xt}\,dt\\
&= \left[-e^{xt}f(t)\right]_C + \int_C\left(\left\{t^2 + \frac{d}{dt}\right\}f(t)\right)e^{xt}\,dt.
\end{aligned} \tag{10.129}$$

Thus $f(t) = e^{-\frac13 t^3}$ and

$$y(x) = \int_C e^{xt - \frac13 t^3}\,dt. \tag{10.130}$$

The contour must end at points where the integrated-out term, $\left[-e^{xt - \frac13 t^3}\right]_C$, vanishes. There are therefore three possible contours, which end at any two of

$$+\infty, \qquad \infty\,e^{2\pi i/3}, \qquad \infty\,e^{-2\pi i/3}.$$

Figure 10.7: Contours providing solutions of Airy's equation.



Since the integrand is an entire function, the sum $y_{C_1} + y_{C_2} + y_{C_3}$ is zero, so only two of the three solutions are linearly independent. The Airy function itself is defined by

$$\mathrm{Ai}(x) = \frac{1}{2\pi i}\int_{C_1}e^{xt - \frac13 t^3}\,dt = \frac{1}{\pi}\int_0^{\infty}\cos\left(xs + \frac13 s^3\right)ds. \tag{10.131}$$

In obtaining the last equality, we have deformed the contour of integration, $C_1$, that ran from $\infty\,e^{-2\pi i/3}$ to $\infty\,e^{2\pi i/3}$ so that it lies on the imaginary axis, and there we have written $t = is$. You may check (à la Jordan) that this deformation does not alter the value of the integral.

To study the asymptotics of this function we need to examine separately two cases $x \gg 0$ and $x \ll 0$. For both ranges of $x$, the principal contribution to the integral will come from the neighbourhood of the stationary points of $f(t) = xt - t^3/3$. These stationary points are never pure maxima or minima of the real part of $f$ (the real part alone determines the magnitude of the integrand) but are always saddle points. We must deform the contour so that on the integration path the stationary point is the highest point in a mountain pass. We must also ensure that everywhere on the contour the difference between $f$ and its maximum value stays real. Because of the orthogonality of the real and imaginary part contours, this means that we must take a path of steepest descent from the pass, hence the name of the method. If we stray from the steepest descent path, the phase of the exponent will be changing. This means that the integrand will oscillate and we can no longer be sure that the result is dominated by the contributions near the saddle point.




Figure 10.8: Steepest descent contours and location and orientation of the saddle passes for a) $x \gg 0$, b) $x \ll 0$.

i) $x \gg 0$: The stationary points are at $t = \pm\sqrt{x}$. Writing $t = \xi - \sqrt{x}$ we have

$$f(\xi) = -\frac23 x^{3/2} + \xi^2\sqrt{x} - \frac13\xi^3 \tag{10.132}$$

while near $t = +\sqrt{x}$ we write $t = \zeta + \sqrt{x}$ and find

$$f(\zeta) = +\frac23 x^{3/2} - \zeta^2\sqrt{x} - \frac13\zeta^3. \tag{10.133}$$

We see that the saddle point near $-\sqrt{x}$ is a local maximum when we route the contour vertically, while the saddle point near $+\sqrt{x}$ is a local maximum as we go down the real axis. Since the contour in $\mathrm{Ai}(x)$ is aimed vertically we can distort it to pass through the saddle point near $-\sqrt{x}$, but cannot find a route through the point at $+\sqrt{x}$ without the integrand oscillating wildly. At the saddle point the exponent, $xt - t^3/3$, is real. If we write $t = u + iv$ we have

$$\mathrm{Im}\,(xt - t^3/3) = v\,(x - u^2 + v^2/3), \tag{10.134}$$

so the exact steepest descent path, on which the imaginary part remains zero, is given by the union of the real axis ($v = 0$) and the curve

$$u^2 - \frac13 v^2 = x. \tag{10.135}$$

This is a hyperbola, and the branch passing through the saddle point at $-\sqrt{x}$ is plotted in a). Now setting $\xi = is$, we find

$$\mathrm{Ai}(x) = \frac{1}{2\pi}e^{-\frac23 x^{3/2}}\int_{-\infty}^{\infty}e^{-\sqrt{x}\,s^2+\cdots}\,ds \sim \frac{1}{2\sqrt{\pi}}\,x^{-1/4}e^{-\frac23 x^{3/2}}. \tag{10.136}$$

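ii) $x \ll 0$: The stationary points are now at $\pm i\sqrt{|x|}$. Setting $t = \xi \pm i\sqrt{|x|}$ we 
find that 

$$f(\xi) = \mp i\frac{2}{3}|x|^{3/2} \mp i\xi^2\sqrt{|x|} + O(\xi^3). \tag{10.137}$$

The exponent is no longer real, but the imaginary part will be constant 
and the integrand non-oscillatory provided we deform the contour so 
that it becomes the disconnected pair of curves shown in b). The 
new contour passes through both saddle points and we must sum their 
contributions. Near $t = i\sqrt{|x|}$ we set $\xi = e^{3\pi i/4}s$ and get 

$$\frac{1}{2\pi i}\,e^{3\pi i/4}e^{-i\frac{2}{3}|x|^{3/2}}\int_{-\infty}^{\infty} e^{-\sqrt{|x|}\,s^2}\,ds = \frac{1}{2i\sqrt{\pi}}\,e^{3\pi i/4}|x|^{-1/4}e^{-i\frac{2}{3}|x|^{3/2}} = -\frac{1}{2i\sqrt{\pi}}\,e^{-i\pi/4}|x|^{-1/4}e^{-i\frac{2}{3}|x|^{3/2}}. \tag{10.138}$$

Near $t = -i\sqrt{|x|}$ we set $\xi = e^{\pi i/4}s$ and get 

$$\frac{1}{2\pi i}\,e^{\pi i/4}e^{i\frac{2}{3}|x|^{3/2}}\int_{-\infty}^{\infty} e^{-\sqrt{|x|}\,s^2}\,ds = \frac{1}{2i\sqrt{\pi}}\,e^{i\pi/4}|x|^{-1/4}e^{i\frac{2}{3}|x|^{3/2}}. \tag{10.139}$$

The sum of these two contributions is 

$$\mathrm{Ai}(x) \sim \frac{1}{\sqrt{\pi}\,|x|^{1/4}}\sin\left(\frac{2}{3}|x|^{3/2} + \frac{\pi}{4}\right). \tag{10.140}$$

The fruit of our labours is therefore 

$$\mathrm{Ai}(x) \sim \begin{cases} \dfrac{1}{2\sqrt{\pi}}\,x^{-1/4}e^{-\frac{2}{3}x^{3/2}}\left[1 + O\left(\dfrac{1}{x}\right)\right], & x > 0, \\[2ex] \dfrac{1}{\sqrt{\pi}\,|x|^{1/4}}\sin\left(\dfrac{2}{3}|x|^{3/2} + \dfrac{\pi}{4}\right)\left[1 + O\left(\dfrac{1}{x}\right)\right], & x < 0. \end{cases} \tag{10.141}$$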
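As a rough numerical sanity check of the two leading forms in (10.141), we can compare them with values of Ai(x) obtained from its Maclaurin series. This is only a sketch: the series representation, the constants Ai(0) and -Ai'(0) written in terms of Gamma functions, and the helper names `airy_series` and `airy_leading` are assumptions brought in for the illustration and are not part of the text's derivation. The series is only practical for moderate |x|.

```python
import math

def airy_series(x, terms=80):
    """Ai(x) from its Maclaurin series; adequate for moderate |x| only."""
    c1 = 1.0 / (3.0 ** (2.0 / 3.0) * math.gamma(2.0 / 3.0))  # Ai(0)
    c2 = 1.0 / (3.0 ** (1.0 / 3.0) * math.gamma(1.0 / 3.0))  # -Ai'(0)
    f_term, g_term = 1.0, x      # first terms of the two solutions of y'' = x y
    f_sum, g_sum = f_term, g_term
    for k in range(terms):
        # Recurrences that follow from y'' = x y for the two power series.
        f_term *= x ** 3 / ((3 * k + 2) * (3 * k + 3))
        g_term *= x ** 3 / ((3 * k + 3) * (3 * k + 4))
        f_sum += f_term
        g_sum += g_term
    return c1 * f_sum - c2 * g_sum

def airy_leading(x):
    """Leading terms of (10.141), without the 1 + O(1/x) factor (x != 0)."""
    if x > 0:
        return math.exp(-2.0 / 3.0 * x ** 1.5) / (2.0 * math.sqrt(math.pi) * x ** 0.25)
    a = abs(x)
    return math.sin(2.0 / 3.0 * a ** 1.5 + math.pi / 4.0) / (math.sqrt(math.pi) * a ** 0.25)

for x in (2.0, 4.0, -4.0, -6.0):
    print(x, airy_series(x), airy_leading(x))
```

The agreement should improve as |x| grows. For negative x it is more informative to look at the absolute rather than the relative difference, because the sine factor passes through zero.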



Suppose that we allow $x$ to become complex, $x \to z = |z|e^{i\theta}$, with $-\pi < 
\theta < \pi$. Then figure 10.9 shows how the steepest descent contour evolves and leads 
to the two quite different expansions for positive and negative $x$. We see that 
for $0 < \theta < 2\pi/3$ the steepest descent path continues to be routed through 
the single stationary point at $-\sqrt{|z|}\,e^{i\theta/2}$. Once $\theta$ reaches $2\pi/3$, though, 
it passes through both stationary points. The contribution to the integral 
from the newly acquired stationary point is, however, exponentially smaller 
as $|z| \to \infty$ than that of $t = -\sqrt{|z|}\,e^{i\theta/2}$. The new term is therefore said to 
be subdominant, and makes an insignificant contribution to the asymptotic 
behaviour of $\mathrm{Ai}(z)$. The two saddle points only make contributions of the 
same magnitude when $\theta$ reaches $\pi$. If we analytically continue beyond $\theta = \pi$, 
the new saddle point will now dominate over the old, and only its contribution 
is significant at large $|z|$. The Stokes line, at which we must change the form 
of the asymptotic expansion, is therefore at $\theta = \pi$. 
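To see the subdominance explicitly, it may help to compare the magnitudes of the two saddle-point exponentials. The following is a sketch that uses only $f(t) = zt - t^3/3$ evaluated at the stationary points $t = \mp\sqrt{z}$, with $\sqrt{z}$ understood as $\sqrt{|z|}\,e^{i\theta/2}$; the Gaussian prefactors and the question of which saddles the contour actually visits are left aside.

```latex
\begin{align*}
f(\mp\sqrt{z}) &= \mp\tfrac{2}{3}z^{3/2}, \qquad
\left|e^{f(\mp\sqrt{z})}\right| = \exp\left(\mp\tfrac{2}{3}|z|^{3/2}\cos\tfrac{3\theta}{2}\right), \\
\left|\frac{e^{f(+\sqrt{z})}}{e^{f(-\sqrt{z})}}\right| &= \exp\left(\tfrac{4}{3}|z|^{3/2}\cos\tfrac{3\theta}{2}\right).
\end{align*}
```

For $2\pi/3 < \theta < \pi$ we have $\cos(3\theta/2) < 0$, so the ratio is exponentially small at large $|z|$. At $\theta = \pi$ the cosine vanishes and the two exponentials have the same magnitude.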

If we try to systematically keep higher order terms we will find, for the 
oscillating $\mathrm{Ai}(-z)$, a double series 

$$\mathrm{Ai}(-z) \sim \pi^{-1/2}z^{-1/4}\left[\sin(\rho + \pi/4)\sum_{n=0}^{\infty}(-1)^n c_{2n}\,\rho^{-2n} - \cos(\rho + \pi/4)\sum_{n=0}^{\infty}(-1)^n c_{2n+1}\,\rho^{-2n-1}\right], \tag{10.142}$$

where $\rho = 2z^{3/2}/3$. In this case, therefore, we need to extract two leading 
coefficients before we have asymptotic power series. 

The subject of asymptotics contains many subtleties, and the reader in 
search of a more detailed discussion is recommended to read Bender and 
Orszag's Advanced Mathematical Methods for Scientists and Engineers. 

Figure 10.9: Evolution of the steepest-descent contour from passing through 
only one saddle point to passing through both. The dashed and solid lines are 
contours of the real and imaginary parts, respectively, of $(zt - t^3/3)$. $\theta = \mathrm{Arg}\,z$ 
takes the values a) $7\pi/12$, b) $15\pi/24$, c) $2\pi/3$, d) $9\pi/12$. 
Exercise 10.2: Consider the behaviour of Bessel functions when x is large. By 
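applying the method of steepest descent to the Hankel function contours show 
that 

$$H^{(1)}_\nu(x) \sim \sqrt{\frac{2}{\pi x}}\,e^{i(x - \nu\pi/2 - \pi/4)}\left[1 - \frac{4\nu^2 - 1}{8ix} + \cdots\right],$$

$$H^{(2)}_\nu(x) \sim \sqrt{\frac{2}{\pi x}}\,e^{-i(x - \nu\pi/2 - \pi/4)}\left[1 + \frac{4\nu^2 - 1}{8ix} + \cdots\right],$$

and hence 

$$J_\nu(x) \sim \sqrt{\frac{2}{\pi x}}\left[\cos\left(x - \frac{\nu\pi}{2} - \frac{\pi}{4}\right) - \frac{4\nu^2 - 1}{8x}\sin\left(x - \frac{\nu\pi}{2} - \frac{\pi}{4}\right) + \cdots\right],$$

$$N_\nu(x) \sim \sqrt{\frac{2}{\pi x}}\left[\sin\left(x - \frac{\nu\pi}{2} - \frac{\pi}{4}\right) + \frac{4\nu^2 - 1}{8x}\cos\left(x - \frac{\nu\pi}{2} - \frac{\pi}{4}\right) + \cdots\right].$$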
10.5 Elliptic Functions 

The subject of elliptic functions goes back to remarkable identities of Giulio 
Fagnano (1750) and Leonhard Euler (1761). Euler's formula is 

$$\int_0^u \frac{dx}{\sqrt{1 - x^4}} + \int_0^v \frac{dy}{\sqrt{1 - y^4}} = \int_0^r \frac{dz}{\sqrt{1 - z^4}}, \tag{10.143}$$

where $0 < u, v < 1$, and 

$$r = \frac{u\sqrt{1 - v^4} + v\sqrt{1 - u^4}}{1 + u^2v^2}. \tag{10.144}$$

This looks mysterious, but perhaps so does 

$$\int_0^u \frac{dx}{\sqrt{1 - x^2}} + \int_0^v \frac{dy}{\sqrt{1 - y^2}} = \int_0^r \frac{dz}{\sqrt{1 - z^2}}, \tag{10.145}$$

where 

$$r = u\sqrt{1 - v^2} + v\sqrt{1 - u^2}, \tag{10.146}$$

until you realize that the latter formula is merely 

$$\sin(a + b) = \sin a\cos b + \cos a\sin b \tag{10.147}$$

in disguise. To see this set 

$$u = \sin a, \qquad v = \sin b \tag{10.148}$$

and remember the integral formula for the inverse trig function 

$$a = \sin^{-1}u = \int_0^u \frac{dx}{\sqrt{1 - x^2}}. \tag{10.149}$$
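Both addition formulae are easy to test numerically. The sketch below evaluates the integrals in (10.143) with a composite Simpson rule and compares the two sides for one pair of sample values. The helper names (`simpson`, `lemniscate_integral`, `euler_r`) and the choice u = 0.3, v = 0.4 are ours, chosen small enough that the sum of the two integrals stays on the principal branch.

```python
import math

def simpson(func, a, b, n=2000):
    """Composite Simpson rule with an even number n of sub-intervals."""
    h = (b - a) / n
    total = func(a) + func(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * func(a + i * h)
    return total * h / 3.0

def lemniscate_integral(u):
    """Integral of dx / sqrt(1 - x^4) from 0 to u, for 0 <= u < 1."""
    return simpson(lambda x: 1.0 / math.sqrt(1.0 - x ** 4), 0.0, u)

def euler_r(u, v):
    """Upper limit r of (10.144)."""
    return (u * math.sqrt(1 - v ** 4) + v * math.sqrt(1 - u ** 4)) / (1 + u * u * v * v)

u, v = 0.3, 0.4
lhs = lemniscate_integral(u) + lemniscate_integral(v)
rhs = lemniscate_integral(euler_r(u, v))
print("quartic case:", lhs, rhs)

# The trigonometric analogue (10.145)-(10.146), using the library arcsine.
trig_lhs = math.asin(u) + math.asin(v)
trig_rhs = math.asin(u * math.sqrt(1 - v ** 2) + v * math.sqrt(1 - u ** 2))
print("trig case:   ", trig_lhs, trig_rhs)
```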

The Fagnano-Euler formula is a similarly disguised addition formula for an 
elliptic function. Just as we use the substitution $x = \sin y$ in the $1/\sqrt{1 - x^2}$ 
integral, we can use an elliptic function substitution to evaluate elliptic 
integrals such as 

$$I_4 = \int_0^x \frac{dt}{\sqrt{(t - a_1)(t - a_2)(t - a_3)(t - a_4)}}, \tag{10.150}$$

$$I_3 = \int_0^x \frac{dt}{\sqrt{(t - a_1)(t - a_2)(t - a_3)}}. \tag{10.151}$$

The integral $I_3$ is a special case of $I_4$, where $a_4$ has been sent to infinity by 
use of a Möbius map 

$$t \to t' = \frac{at + b}{ct + d}, \qquad dt' = (ad - bc)\frac{dt}{(ct + d)^2}. \tag{10.152}$$

Indeed, we can use a suitable Möbius map to send any three of the four 
points $a_n$ to $0$, $1$, $\infty$. 
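One explicit choice of such a map is the cross-ratio construction, sketched below. The function names `mobius_to_0_1_inf` and `apply_mobius` and the sample points are hypothetical, introduced only for this illustration.

```python
def mobius_to_0_1_inf(a1, a2, a3):
    """Coefficients (a, b, c, d) of t -> (a t + b)/(c t + d)
    sending a1 -> 0, a2 -> 1, a3 -> infinity (the three points must be distinct)."""
    a = a2 - a3
    b = -a1 * (a2 - a3)
    c = a2 - a1
    d = -a3 * (a2 - a1)
    return a, b, c, d

def apply_mobius(m, t):
    a, b, c, d = m
    return (a * t + b) / (c * t + d)

# Sample branch points (arbitrary complex numbers chosen for the demonstration).
a1, a2, a3, a4 = 2.0 + 0.0j, -1.0 + 1.0j, 0.5j, 3.0 - 2.0j
m = mobius_to_0_1_inf(a1, a2, a3)
a, b, c, d = m
print("image of a1:", apply_mobius(m, a1))
print("image of a2:", apply_mobius(m, a2))
print("denominator at a3:", c * a3 + d)   # vanishes, so a3 is sent to infinity
print("image of a4:", apply_mobius(m, a4))
print("ad - bc:", a * d - b * c)          # non-zero for distinct points
```

The fourth point lands at a finite value that depends only on the cross-ratio of the four original points.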

The idea of elliptic functions (as opposed to the integrals, which are their 
functional inverse) was known to Gauss, but Abel and Jacobi were the first 
to publish (1827). For the general theory, the simplest elliptic function is 
the Weierstrass $\wp$. This is defined by first selecting two linearly independent 
periods $\omega_1$, $\omega_2$, and setting 

$$\wp(z) = \frac{1}{z^2} + \sum_{m,n}\left(\frac{1}{(z - m\omega_1 - n\omega_2)^2} - \frac{1}{(m\omega_1 + n\omega_2)^2}\right). \tag{10.153}$$

The sum is over integers $m$, $n$, positive and negative, but not both 0. Helped 
by the counterterm, the sum is absolutely convergent, so we can rearrange 
the terms to prove double periodicity 

$$\wp(z + m\omega_1 + n\omega_2) = \wp(z). \tag{10.154}$$

The function is thus determined everywhere by its values in the period 
parallelogram $P = \{\lambda\omega_1 + \mu\omega_2 : 0 \le \lambda, \mu < 1\}$. Double periodicity is the defining 
characteristic of elliptic functions. 
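The lattice sum converges slowly, but a symmetric truncation is enough to see the double periodicity emerge numerically. The sketch below sums (10.153) over |m|, |n| <= N. The function name `weierstrass_p`, the cutoff N, and the sample periods are assumptions made for the illustration; with this truncation periodicity holds only up to an error that shrinks roughly like 1/N^2.

```python
def weierstrass_p(z, w1, w2, N=60):
    """Truncated lattice sum (10.153) over |m|, |n| <= N."""
    total = 1.0 / z ** 2
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m == 0 and n == 0:
                continue
            w = m * w1 + n * w2
            # The counterterm -1/w^2 is what makes the sum absolutely convergent.
            total += 1.0 / (z - w) ** 2 - 1.0 / w ** 2
    return total

w1, w2 = 1.0 + 0.0j, 0.3 + 1.1j       # sample periods (not from the text)
z = 0.31 + 0.17j
p0 = weierstrass_p(z, w1, w2)
print("p(z)      =", p0)
print("p(z + w1) =", weierstrass_p(z + w1, w1, w2))
print("p(z + w2) =", weierstrass_p(z + w2, w1, w2))
print("p(-z)     =", weierstrass_p(-z, w1, w2))
```

The evenness of the function is reproduced to rounding accuracy, because the truncated set of lattice points is itself symmetric under a change of sign.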



Any non-constant meromorphic function, f(z), which is doubly periodic has 
four basic properties: 

a) The function must have at least one pole in its unit cell. Otherwise 
it would be holomorphic and bounded, and therefore a constant by 
Liouville. 

b) The sum of the residues at the poles must add to zero. This follows 
from integrating f(z) around the boundary of the period parallelogram 
and observing that the contributions from opposite edges cancel. 

c) The number of poles in each unit cell must equal the number of zeros. 
This follows from integrating $f'/f$ round the boundary of the period 
parallelogram. 

d) If $f$ has zeros at the $N$ points $z_i$ and poles at the $N$ points $p_i$ then 

$$\sum_{i=1}^{N} z_i - \sum_{i=1}^{N} p_i = n\omega_1 + m\omega_2,$$

where $m$, $n$ are integers. This follows from integrating $zf'/f$ round the 
boundary of the period parallelogram. 

Figure 10.10: Unit cell and double-periodicity. 

The Weierstrass $\wp$ has a second-order pole at the origin. It also obeys 

$$\wp'(z) = -\wp'(-z). \tag{10.155}$$

The property that makes $\wp(z)$ useful for evaluating integrals is 

$$\left(\wp'(z)\right)^2 = 4\wp^3(z) - g_2\wp(z) - g_3, \tag{10.156}$$

where 

$$g_2 = 60\sum_{(m,n)\ne 0}\frac{1}{(m\omega_1 + n\omega_2)^4}, \qquad g_3 = 140\sum_{(m,n)\ne 0}\frac{1}{(m\omega_1 + n\omega_2)^6}. \tag{10.157}$$

Equation (10.156) is proved by examining the first few terms in the Laurent 
expansion in z of the difference of the left hand and right hand sides. All 
negative powers cancel, as does the constant term. The difference is zero at 
z = 0, has no poles or other singularities, and being continuous and periodic is 
automatically bounded. It is therefore identically zero by Liouville's theorem. 
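The cancellation described in this proof can be made explicit. The following sketch expands each term of (10.153) in powers of $z$, uses the definitions (10.157) of $g_2$ and $g_3$ to identify the coefficients, and keeps only the orders needed; odd powers of $z$ drop out because the lattice is symmetric under $\omega \to -\omega$.

```latex
\begin{align*}
\wp(z) &= \frac{1}{z^2} + \frac{g_2}{20}z^2 + \frac{g_3}{28}z^4 + O(z^6),
&
\wp'(z) &= -\frac{2}{z^3} + \frac{g_2}{10}z + \frac{g_3}{7}z^3 + O(z^5), \\
\left(\wp'\right)^2 &= \frac{4}{z^6} - \frac{2g_2}{5}\frac{1}{z^2} - \frac{4g_3}{7} + O(z^2),
&
4\wp^3 &= \frac{4}{z^6} + \frac{3g_2}{5}\frac{1}{z^2} + \frac{3g_3}{7} + O(z^2),
\end{align*}
\begin{equation*}
\left(\wp'\right)^2 - 4\wp^3 + g_2\wp + g_3
= \left(-\tfrac{2}{5} - \tfrac{3}{5} + 1\right)\frac{g_2}{z^2}
+ \left(-\tfrac{4}{7} - \tfrac{3}{7} + 1\right)g_3 + O(z^2) = O(z^2).
\end{equation*}
```

The pole terms and the constant term therefore cancel, leaving a doubly periodic function with no singularities that vanishes at $z = 0$, which is the situation to which Liouville's theorem is applied.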

From the symmetry and periodicity of $\wp$ we see that $\wp'(z) = 0$ at $\omega_1/2$, 
$\omega_2/2$ and $(\omega_1 + \omega_2)/2$ where $\wp(z)$ takes values $e_1 = \wp(\omega_1/2)$, $e_2 = \wp(\omega_2/2)$, 
and $e_3 = \wp((\omega_1 + \omega_2)/2)$. Now $\wp'$ must have exactly three zeros since it has a 
pole of order three at the origin and, by property c), the number of zeros in 
the unit cell is equal to the number of poles. We therefore know the location 
of all three zeros and can factorize 

$$4\wp^3(z) - g_2\wp(z) - g_3 = 4(\wp - e_1)(\wp - e_2)(\wp - e_3). \tag{10.158}$$

We note that the coefficient of $\wp^2$ in the polynomial on the left side is zero, 
implying that $e_1 + e_2 + e_3 = 0$. This is consistent with property d). 

The roots $e_i$ can never coincide. For example, $(\wp(z) - e_1)$ has a double 
zero at $\omega_1/2$, but two zeros is all it is allowed because the number of poles 
per unit cell equals the number of zeros, and $(\wp(z) - e_1)$ has a double pole at 
$0$ as its only singularity. Thus $(\wp - e_1)$ cannot be zero at another point, but 
it would be if $e_1$ coincided with $e_2$ or $e_3$. As a consequence, the discriminant 

$$\Delta = 16(e_1 - e_2)^2(e_2 - e_3)^2(e_1 - e_3)^2 = g_2^3 - 27g_3^2, \tag{10.159}$$

is never zero. 
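These statements about the half-period values can be checked with truncated lattice sums. The sketch below computes $e_1$, $e_2$, $e_3$, $g_2$ and $g_3$ for one sample lattice and compares the two expressions for the discriminant in (10.159). The helper names, the cutoff N = 80, and the sample periods are assumptions for the illustration, and agreement is only expected to within the truncation error.

```python
def lattice_points(w1, w2, N):
    """All non-zero lattice points m*w1 + n*w2 with |m|, |n| <= N."""
    return [m * w1 + n * w2
            for m in range(-N, N + 1) for n in range(-N, N + 1)
            if (m, n) != (0, 0)]

def weierstrass_p(z, pts):
    """Truncated version of the lattice sum (10.153)."""
    return 1.0 / z ** 2 + sum(1.0 / (z - w) ** 2 - 1.0 / w ** 2 for w in pts)

w1, w2 = 1.0 + 0.0j, 0.3 + 1.1j          # sample periods (not from the text)
pts = lattice_points(w1, w2, 80)

g2 = 60 * sum(1.0 / w ** 4 for w in pts)   # (10.157)
g3 = 140 * sum(1.0 / w ** 6 for w in pts)

e1 = weierstrass_p(w1 / 2, pts)
e2 = weierstrass_p(w2 / 2, pts)
e3 = weierstrass_p((w1 + w2) / 2, pts)

disc_from_roots = 16 * (e1 - e2) ** 2 * (e2 - e3) ** 2 * (e1 - e3) ** 2
disc_from_g = g2 ** 3 - 27 * g3 ** 2

print("e1 + e2 + e3 =", e1 + e2 + e3)
print("discriminant from roots:", disc_from_roots)
print("discriminant from g2,g3:", disc_from_g)
```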

We use $\wp$ to write 

$$z = \wp^{-1}(u) = \int_\infty^u \frac{dt}{2\sqrt{(t - e_1)(t - e_2)(t - e_3)}} = \int_\infty^u \frac{dt}{\sqrt{4t^3 - g_2t - g_3}}. \tag{10.160}$$

This maps the $u$ plane cut from $e_1$ to $e_2$ and $e_3$ to $\infty$ one-to-one onto the 
2-torus, regarded as the unit cell of the $\omega_{n,m} = n\omega_1 + m\omega_2$ lattice. 

As $z$ sweeps over the torus, the points $x = \wp(z)$, $y = \wp'(z)$ move on the 
elliptic curve 

$$y^2 = 4x^3 - g_2x - g_3 \tag{10.161}$$

which should be thought of as a set in $\mathbb{CP}^2$. These curves, defined over finite 
fields, and the finite groups of rational points that lie on them, are exploited 
in modern cryptography. 

The magic which leads to addition formulae, such as the Euler-Fagnano 
relation with which we began this section, lies in the (not immediately 
obvious) fact that any elliptic function having the same periods as $\wp(z)$ can be 
expressed as a rational function of $\wp(z)$ and $\wp'(z)$. From this it follows (after 
some thought) that any two such elliptic functions, $f_1(z)$ and $f_2(z)$, obey a 
relation $F(f_1, f_2) = 0$, where 

$$F(x, y) = \sum a_{n,m}x^ny^m \tag{10.162}$$

is a polynomial in $x$ and $y$. We can eliminate $\wp'(z)$ in these relations at the 
expense of introducing square roots. 

Modular invariance 

If $\omega_1$ and $\omega_2$ are periods and define a unit cell, so are 

$$\omega_1' = a\omega_1 + b\omega_2,$$
$$\omega_2' = c\omega_1 + d\omega_2,$$

where $a$, $b$, $c$, $d$ are integers with $ad - bc = \pm 1$. This condition on the 
determinant ensures that the matrix inverse also has integer entries, and so the $\omega_i$ 
can be expressed in terms of the $\omega_i'$ with integer coefficients. Consequently 
the set of integer linear combinations of the $\omega_i'$ generates the same lattice as 
the integer linear combinations of the original $\omega_i$. This notion of redefining 
the unit cell should be familiar to you from solid state physics. If we wish 
to preserve the orientation of the basis vectors, we must restrict ourselves 
to maps whose determinant $ad - bc$ is unity. The set of such transforms 
constitutes the modular group $\mathrm{SL}(2,\mathbb{Z})$. Clearly $\wp$ is invariant under this 
group, as are $g_2$ and $g_3$ and $\Delta$. Now define $\omega_2/\omega_1 = \tau$, and write 

$$g_2(\omega_1, \omega_2) = \frac{1}{\omega_1^4}\,g_2(\tau), \qquad g_3(\omega_1, \omega_2) = \frac{1}{\omega_1^6}\,g_3(\tau), \qquad \Delta(\omega_1, \omega_2) = \frac{1}{\omega_1^{12}}\,\Delta(\tau), \tag{10.163}$$

and also 

$$J(\tau) = \frac{g_2^3}{g_2^3 - 27g_3^2} = \frac{g_2^3}{\Delta}. \tag{10.164}$$

Because the denominator is never zero when $\mathrm{Im}\,\tau > 0$, the function $J(\tau)$ is 
holomorphic in the upper half-plane, but not on the real axis. The function 
$J(\tau)$ is called the elliptic modular function. 
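The claim that a unimodular change of basis leaves the lattice, and hence $g_2$, unchanged can be illustrated numerically. In the sketch below the integer matrix, the sample periods and the function name `g2_lattice` are our own choices. The two truncated sums run over different finite sets of lattice points, so they agree only up to the truncation error.

```python
def g2_lattice(w1, w2, N=80):
    """Truncated lattice sum 60 * sum 1/(m w1 + n w2)^4 over |m|, |n| <= N."""
    total = 0.0
    for m in range(-N, N + 1):
        for n in range(-N, N + 1):
            if m or n:
                total += 1.0 / (m * w1 + n * w2) ** 4
    return 60 * total

a, b, c, d = 2, 1, 1, 1                   # integer matrix with ad - bc = 1
det = a * d - b * c
w1, w2 = 1.0 + 0.0j, 0.3 + 1.1j           # sample periods (not from the text)
w1p, w2p = a * w1 + b * w2, c * w1 + d * w2

# Because det = +-1 the inverse matrix has integer entries, so the old
# periods are integer combinations of the new ones.
inv = (d // det, -b // det, -c // det, a // det)
w1_back = inv[0] * w1p + inv[1] * w2p
w2_back = inv[2] * w1p + inv[3] * w2p

g2_old = g2_lattice(w1, w2)
g2_new = g2_lattice(w1p, w2p)
print("recovered periods:", w1_back, w2_back)
print("g2 (old basis):", g2_old)
print("g2 (new basis):", g2_new)
```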

Except for the prefactors $\omega_1^n$, the functions $g_i(\tau)$, $\Delta(\tau)$ and $J(\tau)$ are 
invariant under the Möbius transformation 

$$\tau \to \frac{a\tau + b}{c\tau + d} \tag{10.165}$$

with 

$$\begin{pmatrix} a & b \\ c & d \end{pmatrix} \in \mathrm{SL}(2,\mathbb{Z}). \tag{10.166}$$

This Möbius transformation does not change if the entries in the matrix are 
multiplied by a common factor of $\pm 1$, and so the transformation is an element 
of the modular group $\mathrm{PSL}(2,\mathbb{Z}) = \mathrm{SL}(2,\mathbb{Z})/\{I, -I\}$. 

Taking into account the change in the $\omega_1^n$ prefactors we have 

$$g_2\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^4\,g_2(\tau),$$
$$g_3\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^6\,g_3(\tau),$$
$$\Delta\left(\frac{a\tau + b}{c\tau + d}\right) = (c\tau + d)^{12}\,\Delta(\tau). \tag{10.167}$$

Because $c = 0$ and $d = 1$ for the special case $\tau \to \tau + 1$, these three functions 
obey $f(\tau + 1) = f(\tau)$ and so depend on $\tau$ only via the combination $q^2 = e^{2\pi i\tau}$. 
For example, it is not hard to prove that 

$$\Delta(\tau) = (2\pi)^{12}\,q^2\prod_{n=1}^{\infty}\left(1 - q^{2n}\right)^{24}. \tag{10.168}$$

We can also expand them as power series in $q^2$, and here things get interesting 
because the coefficients have number-theoretic properties. For example 

$$g_2(\tau) = (2\pi)^4\left[\frac{1}{12} + 20\sum_{n=1}^{\infty}\sigma_3(n)\,q^{2n}\right],$$
$$g_3(\tau) = (2\pi)^6\left[\frac{1}{216} - \frac{7}{3}\sum_{n=1}^{\infty}\sigma_5(n)\,q^{2n}\right]. \tag{10.169}$$

The symbol $\sigma_k(n)$ is defined by $\sigma_k(n) = \sum d^k$ where $d$ runs over all positive 
divisors of the number $n$. 
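These expansions lend themselves to a quick numerical consistency check. The sketch below sums the series for $g_2$ and $g_3$, evaluates the product formula for $\Delta$, and tests both the relation $\Delta = g_2^3 - 27g_3^2$ and the weight-four transformation law under $\tau \to -1/\tau$ (the case $a = 0$, $b = -1$, $c = 1$, $d = 0$). The function names, the truncation at 60 terms and the sample value of $\tau$ are assumptions made for the illustration.

```python
import cmath
import math

def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

def g2(tau, terms=60):
    q2 = cmath.exp(2j * math.pi * tau)
    s = sum(sigma(3, n) * q2 ** n for n in range(1, terms + 1))
    return (2 * math.pi) ** 4 * (1.0 / 12 + 20 * s)

def g3(tau, terms=60):
    q2 = cmath.exp(2j * math.pi * tau)
    s = sum(sigma(5, n) * q2 ** n for n in range(1, terms + 1))
    return (2 * math.pi) ** 6 * (1.0 / 216 - 7.0 / 3 * s)

def delta_product(tau, terms=60):
    q2 = cmath.exp(2j * math.pi * tau)
    prod = 1.0
    for n in range(1, terms + 1):
        prod *= (1 - q2 ** n) ** 24
    return (2 * math.pi) ** 12 * q2 * prod

tau = 0.3 + 1.1j                          # sample point in the upper half-plane
disc = g2(tau) ** 3 - 27 * g3(tau) ** 2
print("g2^3 - 27 g3^2 :", disc)
print("product formula:", delta_product(tau))
print("g2(-1/tau)     :", g2(-1 / tau))
print("tau^4 g2(tau)  :", tau ** 4 * g2(tau))
```

The series converge quickly here because $|q^2| = e^{-2\pi\,\mathrm{Im}\,\tau}$ is small for both $\tau$ and $-1/\tau$ at this sample point.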

In the case of the function $J(\tau)$, the prefactors cancel and 

$$J\left(\frac{a\tau + b}{c\tau + d}\right) = J(\tau), \tag{10.170}$$

so $J(\tau)$ is a modular invariant. One can show that if $J(\tau_1) = J(\tau_2)$, then 

$$\tau_2 = \frac{a\tau_1 + b}{c\tau_1 + d} \tag{10.171}$$

for some modular transformation with integer $a$, $b$, $c$, $d$, where $ad - bc = 1$, 
and further, that any modular invariant function is a rational function of 
$J(\tau)$. It seems clear that $J(\tau)$ is rather a special object. 

This $J(\tau)$ is the function referred to on page 174 in connection with the 
Monster group. As with the $g_i$, $J(\tau)$ depends on $\tau$ only through $q^2$. The first 
few terms in the power series expansion of $J(\tau)$ in terms of $q^2$ turn out to be 

$$1728J(\tau) = q^{-2} + 744 + 196884q^2 + 21493760q^4 + 864299970q^6 + \cdots. \tag{10.172}$$
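The integer coefficients can be generated exactly from (10.164), (10.168) and (10.169). Writing $x = q^2$ and $E_4 = 1 + 240\sum\sigma_3(n)x^n$, so that $g_2 = (2\pi)^4E_4/12$, those formulae give $1728J = E_4^3/\left(x\prod(1 - x^n)^{24}\right)$. The sketch below carries out this division with truncated integer power series. The symbol $E_4$, the variable names and the truncation order K are our own notation for the illustration.

```python
def sigma(k, n):
    """Sum of the k-th powers of the positive divisors of n."""
    return sum(d ** k for d in range(1, n + 1) if n % d == 0)

K = 6   # keep the coefficients of x^0 ... x^(K-1), where x = q^2

def mul(a, b):
    """Product of two power series truncated at order K (integer coefficients)."""
    out = [0] * K
    for i, ai in enumerate(a):
        for j, bj in enumerate(b):
            if i + j < K:
                out[i + j] += ai * bj
    return out

# E4 = 1 + 240 * sum sigma_3(n) x^n, so that g2 = (2 pi)^4 E4 / 12.
e4 = [1] + [240 * sigma(3, n) for n in range(1, K)]
e4_cubed = mul(mul(e4, e4), e4)

# P(x) = prod_{n>=1} (1 - x^n)^24, truncated; factors with n >= K cannot contribute.
p = [1] + [0] * (K - 1)
for n in range(1, K):
    for _ in range(24):
        for i in range(K - 1, n - 1, -1):   # multiply in place by (1 - x^n)
            p[i] -= p[i - n]

# Power-series inverse of P, which has constant term 1.
inv = [1] + [0] * (K - 1)
for k in range(1, K):
    inv[k] = -sum(p[i] * inv[k - i] for i in range(1, k + 1))

# j_coeffs[k] is the coefficient of x^(k-1) in 1728 J.
j_coeffs = mul(e4_cubed, inv)
print(j_coeffs)
```

Because all arithmetic is done with Python integers, the coefficients are exact rather than floating-point approximations.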

Since $AJ(\tau) + B$ has all the same modular invariance properties as $J(\tau)$, the 
numbers $1728 = 12^3$ and $744$ are just conventional normalizations. Once we 
set the coefficient of $q^{-2}$ to unity, however, the remaining integer coefficients 
are completely determined by the modular properties. A number-theory 
interpretation of these integers seemed lacking until John McKay and others 
observed that 

1 = 1 
196884 = 1 + 196883 
21493760 = 1 + 196883 + 21296876 
864299970 = 2 x 1 + 2 x 196883 + 21296876 + 842609326, 

(10.173) 

where "1" and the large integers on the right-hand side are the dimensions of 
the smallest irreducible representations of the Monster. This "Monstrous 
Moonshine" was originally mysterious and almost unbelievable ("moonshine" 
= "fantastic nonsense"), but it was explained by Richard Borcherds 
by the use of techniques borrowed from string theory.³ Borcherds received 
the 1998 Fields Medal for this work. 
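The arithmetic in these decompositions is easily confirmed. The short check below takes the dimensions of the four smallest irreducible representations of the Monster to be 1, 196883, 21296876 and 842609326, and compares the sums with the coefficients in (10.172). The variable names are ours.

```python
# Dimensions of the four smallest irreducible representations, as assumed above.
dims = [1, 196883, 21296876, 842609326]

# Coefficients of q^2, q^4, q^6 in the expansion (10.172) of 1728 J.
coeffs = [196884, 21493760, 864299970]

sums = [
    dims[0] + dims[1],
    dims[0] + dims[1] + dims[2],
    2 * dims[0] + 2 * dims[1] + dims[2] + dims[3],
]
for coefficient, total in zip(coeffs, sums):
    print(coefficient, total, coefficient == total)
```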



³ "I was in Kashmir. I had been traveling around northern India, and there was one 
really long tiresome bus journey, which lasted about 24 hours. Then the bus had to stop 
because there was a landslide and we couldn't go any further. It was all pretty darn 
unpleasant. Anyway, I was just toying with some calculations on this bus journey and 
finally I found an idea which made everything work" - Richard Borcherds (Interview in 
The Guardian, August 1998). 

10.6 Further Exercises and Problems 

Exercise 10.3: Show that the binomial series expansion of $(1 + x)^{-\nu}$ can be 
written as 

$$(1 + x)^{-\nu} = \sum_{m=0}^{\infty}(-x)^m\frac{\Gamma(m + \nu)}{\Gamma(\nu)\,m!}, \qquad |x| < 1.$$

Exercise 10.4: A Mellin transform and its inverse. Combine the Beta-function 
identity (10.15) with a suitable change of variables to evaluate the Mellin 
transform 

$$\int_0^\infty x^{s-1}(1 + x)^{-\nu}\,dx, \qquad \nu > 0,$$

of $(1 + x)^{-\nu}$ as a product of Gamma functions. Now consider the integral 

$$\frac{1}{2\pi i\,\Gamma(\nu)}\int_{c - i\infty}^{c + i\infty} x^{-s}\,\Gamma(\nu - s)\Gamma(s)\,ds.$$

Here $\mathrm{Re}\,c \in (0, \nu)$. The contour therefore runs parallel to the imaginary axis 
with the poles of $\Gamma(s)$ to its left and the poles of $\Gamma(\nu - s)$ to its right. Use the 
identity 

$$\Gamma(s)\Gamma(1 - s) = \pi\,\mathrm{cosec}\,\pi s$$

to show that when $|x| < 1$ the contour can be closed by a large semicircle lying 
to the left of the imaginary axis. By using the preceding exercise to sum the 
contributions from the enclosed poles at $s = -n$, evaluate the integral. 

Exercise 10.5: Mellin-Barnes integral. Use the technique developed in the 
preceding exercise to show that 

$$F(a, b; c; -x) = \frac{\Gamma(c)}{2\pi i\,\Gamma(a)\Gamma(b)}\int_{c - i\infty}^{c + i\infty} x^{-s}\,\frac{\Gamma(a - s)\Gamma(b - s)\Gamma(s)}{\Gamma(c - s)}\,ds,$$

for a suitable range of $x$. This integral representation of the hypergeometric 
function is due to the English mathematician Ernest Barnes (1908), later a 
controversial Bishop of Birmingham. 
Exercise 10.6: Let 

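$$Y = \begin{pmatrix} y_1 \\ y_2 \end{pmatrix}.$$

Show that the matrix differential equation 

$$\frac{dY}{dz} = \frac{A}{z}\,Y + \frac{B}{1 - z}\,Y,$$

where 

$$A = \begin{pmatrix} 0 & a \\ 0 & 1 - c \end{pmatrix}, \qquad B = \begin{pmatrix} 0 & 0 \\ b & a + b - c + 1 \end{pmatrix},$$

has a solution 

$$Y(z) = F(a, b; c; z)\begin{pmatrix} 1 \\ 0 \end{pmatrix} + \frac{z}{a}\,F'(a, b; c; z)\begin{pmatrix} 0 \\ 1 \end{pmatrix}.$$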

Exercise 10.7: Knizhnik-Zamolodchikov equation. The monodromy properties 
of solutions of differential equations play an important role in conformal field 
theory. The Fuchsian equations studied in this exercise are obeyed by the 
correlation functions in the level-$k$ Wess-Zumino-Witten model. 

Let $V^{(a)}$, $a = 1, \ldots, n$, be spin-$j_a$ representation spaces for the group SU(2). Let 
$W(z_1, \ldots, z_n)$ be a function taking values in $V^{(1)} \otimes V^{(2)} \otimes \cdots \otimes V^{(n)}$. (In other 
words $W$ is a function $W_{i_1, \ldots, i_n}(z_1, \ldots, z_n)$ where the index $i_a$ labels states in 
the spin-$j_a$ factor.) Suppose that $W$ obeys the Knizhnik-Zamolodchikov (K-Z) 
equations 

$$(k + 2)\frac{\partial}{\partial z_a}W = \sum_{b,\,b \ne a}\frac{J^{(a)} \cdot J^{(b)}}{z_a - z_b}\,W, \qquad a = 1, \ldots, n,$$

where 

$$J^{(a)} \cdot J^{(b)} \equiv J^{(a)}_1J^{(b)}_1 + J^{(a)}_2J^{(b)}_2 + J^{(a)}_3J^{(b)}_3,$$

and $J^{(a)}_i$ indicates the su(2) generator $J_i$ acting on the $V^{(a)}$ factor in the tensor 
product. If we set $z_1 = z$, for example, and fix the position of $z_2, \ldots, z_n$, then 
the differential equation in $z$ has regular singular points at the $n - 1$ remaining 
$z_b$. 

a) By diagonalizing the operator $J^{(a)} \cdot J^{(b)}$ show that there are solutions 
$W(z)$ that behave for $z_a$ close to $z_b$ as 

$$W(z) \sim (z_a - z_b)^{\Delta_j - \Delta_{j_a} - \Delta_{j_b}},$$

where 

$$\Delta_j = \frac{j(j + 1)}{k + 2}, \qquad \Delta_{j_a} = \frac{j_a(j_a + 1)}{k + 2},$$

and $j$ is one of the spins $|j_a - j_b| \le j \le j_a + j_b$ occurring in the 
decomposition of $j_a \otimes j_b$. 
b) Define covariant derivatives 

$$\nabla_a = \frac{\partial}{\partial z_a} - \frac{1}{k + 2}\sum_{b,\,b \ne a}\frac{J^{(a)} \cdot J^{(b)}}{z_a - z_b}$$

and show that $[\nabla_a, \nabla_b] = 0$. Conclude that the effect of parallel transport 
of the solutions of the K-Z equations provides a representation of the 
braid group of the world lines of the $z_a$. 